Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

Authors: Yu Sun, Xiaolong Wang, Zhuang Liu, John Miller, Alexei Efros, Moritz Hardt

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experimentally validate our method in the context of object recognition on several standard benchmarks. These include images with diverse types of corruption at various levels (Hendrycks & Dietterich, 2019), video frames of moving objects (Shankar et al., 2019), and a new test set of unknown shifts collected by Recht et al. (2018). Our algorithm makes substantial improvements under distribution shifts, while maintaining the same performance on the original distribution.
Researcher Affiliation | Academia | 1 University of California, Berkeley; 2 University of California, San Diego; 3 MH is a paid consultant for Twitter. Correspondence to: Yu Sun <yusun@berkeley.edu>.
Pseudocode | No | The paper contains mathematical equations describing the model and optimization problems but no structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Project website: https://test-time-training.github.io/.
Open Datasets | Yes | We use ResNets (He et al., 2016b), which are constructed differently for CIFAR-10 (Krizhevsky & Hinton, 2009) and ImageNet (Russakovsky et al., 2015).
Dataset Splits | No | For CIFAR-10, the paper states '50K images for training, and 10K images for testing,' lacking an explicit validation split. For ImageNet, it notes '1.2M images for training and the 50K validation images are used as the test set,' which combines validation and test rather than providing a distinct validation set for hyperparameter tuning. (The stated splits are sketched in code after the table.)
Hardware Specification | No | The paper does not specify any particular CPU or GPU models or other hardware used to run the experiments, only mentioning 'ResNets' and 'Group Normalization'.
Software Dependencies | No | The paper mentions general software components like 'stochastic gradient descent' and 'Group Normalization' but does not provide specific version numbers for any libraries, frameworks (e.g., PyTorch, TensorFlow), or other software dependencies.
Experiment Setup | Yes | For Test-Time Training (Equation 3), we use stochastic gradient descent with the learning rate set to that of the last epoch during training, which is 0.001 in all our experiments. We set weight decay and momentum to zero during Test-Time Training... For the standard version of Test-Time Training, we take ten gradient steps... For the online version of Test-Time Training, we take only one gradient step... We use random crop and random horizontal flip for data augmentation.
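
The experiment-setup details quoted above can be illustrated with a minimal, hypothetical PyTorch-style sketch (the paper does not name a framework). Here `encoder`, `ssl_head`, and `cls_head` are placeholder modules for the shared feature extractor, the self-supervised head, and the main classifier, and rotation prediction is an assumption about the paper's auxiliary task.

```python
# Hypothetical PyTorch-style sketch of the test-time update quoted above.
# `encoder`, `ssl_head`, and `cls_head` are placeholder modules: the shared
# feature extractor, the self-supervised head, and the main classifier.
import torch
import torch.nn.functional as F

def test_time_train(encoder, ssl_head, cls_head, x, num_steps=10, lr=0.001):
    """Adapt the shared encoder on a batch of test inputs, then predict.

    num_steps=10 corresponds to the standard version described above;
    the online version would take a single step (num_steps=1).
    """
    # SGD with weight decay and momentum set to zero, as in the quoted setup.
    # Only the shared encoder is adapted here; updating the self-supervised
    # head as well is an implementation choice not settled by the quote.
    opt = torch.optim.SGD(encoder.parameters(), lr=lr,
                          momentum=0.0, weight_decay=0.0)

    for _ in range(num_steps):
        # Self-supervised objective on the test images themselves: predict
        # which of four rotations was applied (rotation prediction is an
        # assumption about the auxiliary task). The random crop and
        # horizontal flip mentioned above are omitted for brevity.
        angles = torch.randint(0, 4, (x.size(0),))
        x_rot = torch.stack([torch.rot90(img, int(k), dims=(1, 2))
                             for img, k in zip(x, angles)])
        loss = F.cross_entropy(ssl_head(encoder(x_rot)), angles)
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Prediction with the adapted encoder.
    with torch.no_grad():
        return cls_head(encoder(x)).argmax(dim=1)
```

In the standard version the adapted parameters are reset for each new test input, while the online version keeps updating them as test samples arrive; neither reset logic is shown in this sketch.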
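
For the Dataset Splits row above, a minimal loading sketch might look as follows, assuming torchvision (not named in the paper); root paths and transforms are placeholders, and ImageNet must already be available locally.

```python
# Minimal sketch of the quoted splits, assuming torchvision (not named in the
# paper). Root paths and transforms are placeholders.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# CIFAR-10: 50K training images, 10K test images, no separate validation split.
cifar_train = datasets.CIFAR10("data/cifar10", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("data/cifar10", train=False, download=True, transform=to_tensor)

# ImageNet: 1.2M training images; the 50K-image validation split serves as the test set.
imagenet_train = datasets.ImageNet("data/imagenet", split="train", transform=to_tensor)
imagenet_test = datasets.ImageNet("data/imagenet", split="val", transform=to_tensor)
```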