Accelerating Augmentation Invariance Pretraining

Authors: Jinhong Lin, Cheng-En Wu, Yibing Wei, Pedro Morgado

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct all experiments on the ImageNet dataset [9] using a ViT-Base transformer backbone for both the online and target encoders. Ablations and parametric studies are conducted on the ImageNet-100 (IN100) dataset, a randomly chosen subset of 100 classes from ImageNet.
Researcher Affiliation | Academia | Jinhong Lin, Cheng-En Wu*, Yibing Wei, Pedro Morgado, University of Wisconsin-Madison, {jlin522, cwu356, wei96, pmorgado}@wisc.edu
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The code will be released after the paper is accepted.
Open Datasets | Yes | We conduct all experiments on the ImageNet dataset [9] using a ViT-Base transformer backbone for both the online and target encoders. Ablations and parametric studies are conducted on the ImageNet-100 (IN100) dataset, a randomly chosen subset of 100 classes from ImageNet. (A subset-construction sketch follows the table.)
Dataset Splits | Yes | We adhere to the class partitioning used in previous studies [33, 29].
Hardware Specification | No | As for ImageNet-1k experiments, we followed the official training hyperparameters except for batch size, which was set to 1024 due to hardware limitations.
Software Dependencies | No | The paper does not specify software dependencies with version numbers.
Experiment Setup | Yes | To establish an optimized baseline on ImageNet-100, we empirically search for the learning rate, batch size and the required training budget (default values were used for other hyper-parameters). We observed that performance saturated for batch sizes of 512, and training budgets equivalent to 1000 epochs with a 40-epoch warmup phase. The optimal base learning rate was 5×10^-4, adjusted by the batch size scaling rule [11]. (A training-configuration sketch follows the table.)
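The ImageNet-100 subset referenced in the Open Datasets and Dataset Splits rows is defined by a fixed list of 100 classes taken from prior work [33, 29]; the class list itself is not reproduced in the quoted text. Below is a minimal sketch of how such a subset can be materialized from a standard ImageNet directory, assuming a hypothetical imagenet100_classes.txt file with one WordNet ID per line (the file name and paths are placeholders, not details from the paper):

```python
import os
from torchvision.datasets import ImageFolder

# Hypothetical inputs: the 100 WordNet IDs defining the IN-100 split (the
# paper follows the partitioning of prior work [33, 29], not listed here)
# and a standard ImageNet directory of per-class folders.
IN100_CLASS_FILE = "imagenet100_classes.txt"
IMAGENET_TRAIN_DIR = "/path/to/imagenet/train"


def load_in100_wnids(class_file: str = IN100_CLASS_FILE) -> set:
    """Read one WordNet ID per line, e.g. 'n01558993'."""
    with open(class_file) as f:
        return {line.strip() for line in f if line.strip()}


class ImageNet100(ImageFolder):
    """ImageFolder restricted to the 100 selected ImageNet classes."""

    def find_classes(self, directory):
        keep = load_in100_wnids()
        classes = sorted(d.name for d in os.scandir(directory)
                         if d.is_dir() and d.name in keep)
        class_to_idx = {name: idx for idx, name in enumerate(classes)}
        return classes, class_to_idx


# Usage (augmentations/transforms omitted):
# train_set = ImageNet100(IMAGENET_TRAIN_DIR)
```

Restricting find_classes keeps the standard ImageFolder indexing intact, so the same loading pipeline can serve both the IN-100 ablations and full ImageNet-1k runs.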
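The Experiment Setup row quotes the concrete hyperparameters for the ImageNet-100 baseline: batch size 512, a 1000-epoch budget with a 40-epoch warmup, and a base learning rate of 5×10^-4 adjusted by the batch size scaling rule [11]. The sketch below shows how those numbers combine under the common linear scaling convention; the reference batch size of 256 and the cosine decay after warmup are assumptions for illustration, not details stated in the quote:

```python
import math

# Values quoted in the Experiment Setup row; the reference batch size of 256
# and the cosine decay after warmup are assumptions for illustration.
BASE_LR = 5e-4          # optimal base learning rate (quoted)
BATCH_SIZE = 512        # batch size at which performance saturated (quoted)
EPOCHS = 1000           # total training budget in epochs (quoted)
WARMUP_EPOCHS = 40      # warmup phase (quoted)

# Batch size scaling rule [11]: scale the base learning rate linearly with
# the batch size, relative to an assumed reference batch size of 256.
peak_lr = BASE_LR * BATCH_SIZE / 256


def lr_at_epoch(epoch: float) -> float:
    """Learning rate schedule: linear warmup, then (assumed) cosine decay."""
    if epoch < WARMUP_EPOCHS:
        return peak_lr * epoch / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / (EPOCHS - WARMUP_EPOCHS)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print(peak_lr)             # 0.001 for batch size 512
    print(lr_at_epoch(40))     # peak rate reached right after warmup
    print(lr_at_epoch(1000))   # decays to ~0 at the end of training
```

For the quoted batch size of 512, this convention gives an effective peak learning rate of 1e-3, reached at the end of the 40-epoch warmup.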