Test-Time Adaptation via Conjugate Pseudo-labels
Authors: Sachin Goyal, Mingjie Sun, Aditi Raghunathan, J. Zico Kolter
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our approach consistently dominates other TTA alternatives over a wide range of domain adaptation benchmarks. Our approach is particularly of interest when applied to classifiers trained with novel loss functions, e.g., the recently-proposed PolyLoss [25] function, where it differs substantially from (and outperforms) an entropy-based loss. Further, we show that our conjugate-based approach can also be interpreted as a kind of self-training using a very specific soft label, which we refer to as the conjugate pseudo-label. Overall, our method provides a broad framework for better understanding and improving test-time adaptation. Code is available at https://github.com/locuslab/tta_conjugate. |
| Researcher Affiliation | Collaboration | Sachin Goyal¹, Mingjie Sun¹, Aditi Raghunathan¹, Zico Kolter¹,² (¹Carnegie Mellon University, ²Bosch Center for AI); {sachingo, mingjies, raditi, zkolter}@cs.cmu.edu |
| Pseudocode | Yes | The full procedure for test-time adaptation via conjugate pseudo-labels is shown in Algorithm 1. (Algorithm 1 is presented on page 6 of the paper; a hedged PyTorch sketch of the adaptation step appears after this table.) |
| Open Source Code | Yes | Code is available at https://github.com/locuslab/tta_conjugate. |
| Open Datasets | Yes | We evaluate on the three common corruption benchmarks: adapting a classifier trained on CIFAR-10 to CIFAR-10-C, CIFAR-100 to CIFAR-100-C and ImageNet to ImageNet-C [15]. ... We also evaluate on three domain adaptation datasets: adapting a classifier trained on SVHN to MNIST, an ImageNet classifier to ImageNet-R [16] and adapting from synthetic to real data in VISDA-C [38]. |
| Dataset Splits | Yes | We tune the learning rate (LR) and temperature (T) on the validation noises in the corruption benchmark by grid-search. LR is selected from {1e-1, 1e-2, ..., 1e-4} and T from {1, 2, ..., 5}. (A grid-search sketch appears after this table.) |
| Hardware Specification | Yes | All the experiments have been performed on A6000 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., libraries like PyTorch, TensorFlow, or specific Python versions). |
| Experiment Setup | Yes | We tune the learning rate (LR) and temperature (T) on the validation noises in the corruption benchmark by grid-search. LR is selected from {1e-1, 1e-2, ..., 1e-4} and T from {1, 2, ..., 5}. ... Following [50] and [40], we fine-tune by updating the learnable scale and shift parameters of the batch normalization layers across all adaptation losses. For each batch, batch normalization statistics are also updated, as suggested in [41]. (A sketch of this setup appears below.) |
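
For the pseudocode row above: the paper shows that when the source classifier was trained with cross-entropy, the conjugate adaptation loss reduces to a temperature-scaled softmax entropy, which is equivalent to self-training against the conjugate pseudo-label softmax(f(x)/T). The following is a minimal PyTorch sketch of one adaptation step under that assumption, not the authors' implementation; the names `conjugate_adapt_step`, `model`, `x`, and `optimizer` are illustrative.

```python
import torch.nn.functional as F

def conjugate_adapt_step(model, x, optimizer, T=2.0):
    """One test-time adaptation step on an unlabeled test batch x (CE source loss)."""
    logits = model(x)                                 # forward pass on the test batch
    log_p = F.log_softmax(logits / T, dim=1)          # temperature-scaled log-probabilities
    loss = -(log_p.exp() * log_p).sum(dim=1).mean()   # conjugate loss = softmax entropy for CE
    optimizer.zero_grad()
    loss.backward()                                   # gradients reach only the trainable BN parameters
    optimizer.step()
    return logits.detach()                            # predictions made before the update
```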
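
For the experiment-setup row: a hedged sketch of the TENT-style [50] parameter selection described there, assuming a standard PyTorch model with batch-normalization layers. Only the BN scale and shift are made trainable, and normalization statistics come from each test batch, per [41]. The exact flags here (e.g., `track_running_stats = False`) are an assumption, not code from the paper's repository.

```python
import torch
import torch.nn as nn

def configure_bn_adaptation(model, lr=1e-3):
    """Freeze all weights except BN affine parameters; normalize with batch statistics."""
    model.requires_grad_(False)        # freeze the whole network by default
    model.train()                      # BN layers use current-batch statistics
    params = []
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d)):
            m.requires_grad_(True)     # learnable scale (weight) and shift (bias)
            m.track_running_stats = False
            m.running_mean, m.running_var = None, None   # assumption: discard source statistics
            if m.weight is not None:
                params += [m.weight, m.bias]
    return torch.optim.Adam(params, lr=lr)
```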
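
For the dataset-splits row: the grid search amounts to something like the loop below, reusing the two helpers above. `make_model` (a fresh copy of the source classifier) and `val_error` (mean error after adaptation on the validation corruptions) are hypothetical stand-ins; the paper does not specify them as code.

```python
def grid_search(make_model, val_error):
    """Pick (LR, T) minimizing error on the validation noises."""
    best = (float("inf"), None, None)
    for lr in [1e-1, 1e-2, 1e-3, 1e-4]:              # LR grid from the paper
        for T in [1, 2, 3, 4, 5]:                    # temperature grid from the paper
            model = make_model()                     # restart from the source classifier
            optimizer = configure_bn_adaptation(model, lr=lr)
            err = val_error(model, optimizer, T)     # adapt, then evaluate
            if err < best[0]:
                best = (err, lr, T)
    return best                                      # (error, lr, T)
```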