Test-Time Training with Masked Autoencoders

Authors: Yossi Gandelsman, Yu Sun, Xinlei Chen, Alexei A. Efros

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, our simple method improves generalization on many visual benchmarks for distribution shifts. Theoretically, we characterize this improvement in terms of the bias-variance trade-off. |
| Researcher Affiliation | Collaboration | Yossi Gandelsman (UC Berkeley), Yu Sun (UC Berkeley), Xinlei Chen (Meta AI), Alexei A. Efros (UC Berkeley) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and models are available at https://yossigandelsman.github.io/ttt_mae/index.html. |
| Open Datasets | Yes | ImageNet-1k [9] ... ImageNet-C [22] ... ImageNet-A [23] ... ImageNet-R [21] ... The Portraits dataset [16] |
| Dataset Splits | Yes | It also obtains higher accuracy than linear fine-tuning without aggressive augmentations on the ImageNet validation set. ... We sort the entire dataset by year and split it into four equal parts, with 5062 images each. ... for each, we train on one of the first three splits and test on the fourth. (A split sketch appears after the table.) |
| Hardware Specification | Yes | Most experiments are performed on four NVIDIA A100 GPUs; hyper-parameter sweeps are run on an industrial cluster with V100 GPUs. |
| Software Dependencies | No | The paper mentions software components like AdamW and SGD and refers to models from other papers, but does not specify version numbers for any software dependencies (e.g., Python, PyTorch, specific library versions). |
| Experiment Setup | Yes | TTT is performed with SGD, as discussed, for 20 steps, using a momentum of 0.9, weight decay of 0.2, batch size of 128, and fixed learning rate of 5e-3. (A test-time SGD sketch appears after the table.) |
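The Dataset Splits row describes a year-ordered split of the Portraits dataset: sort by year, cut into four equal parts of 5062 images, train on one of the first three parts, and test on the fourth. The sketch below illustrates that protocol under stated assumptions; the `(image_path, year)` record format and the toy record list are hypothetical, not the authors' data-loading code.

```python
def make_year_splits(records, num_splits=4):
    """records: list of (image_path, year) pairs; returns num_splits equal, year-ordered chunks."""
    ordered = sorted(records, key=lambda r: r[1])        # sort the entire dataset by year
    size = len(ordered) // num_splits                    # 5062 images per split in the paper
    return [ordered[i * size:(i + 1) * size] for i in range(num_splits)]

# Toy usage with fabricated records; the real Portraits dataset has 4 x 5062 images.
records = [(f"img_{i}.png", 1900 + i) for i in range(20)]
splits = make_year_splits(records)
test_split = splits[3]                                   # held-out fourth split (latest years)
train_splits = splits[:3]                                # train on one of these per experiment
```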
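The Experiment Setup row fully specifies the test-time optimizer, so a short sketch of a per-image adaptation loop with those settings may help. This is a minimal illustration, not the authors' released implementation: the `loss_fn` argument, the toy model, and the placeholder reconstruction objective are assumptions standing in for the MAE encoder/decoder and its masked-reconstruction loss.

```python
import copy
import torch
import torch.nn as nn

def test_time_train(pretrained, views, loss_fn, steps=20, lr=5e-3):
    """Adapt a copy of the pretrained model to one test image's augmented views."""
    model = copy.deepcopy(pretrained)            # reset to the pretrained weights for each test image
    model.train()
    # Optimizer settings quoted in the paper: SGD, momentum 0.9, weight decay 0.2, fixed lr 5e-3.
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9, weight_decay=0.2)
    for _ in range(steps):                       # 20 SGD steps per test image
        loss = loss_fn(model, views)             # e.g., the MAE masked-reconstruction objective
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return model

# Toy usage with a stand-in model and loss; the real method uses a ViT-based MAE.
toy_model = nn.Linear(16, 16)
toy_views = torch.randn(128, 16)                 # batch of 128 augmented views of one test image
toy_loss = lambda m, x: ((m(x) - x) ** 2).mean() # placeholder reconstruction loss
adapted = test_time_train(toy_model, toy_views, toy_loss)
```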