Contrastive Tuning: A Little Help to Make Masked Autoencoders Forget
Authors: Johannes Lehner, Benedikt Alkin, Andreas Fürst, Elisabeth Rumetshofer, Lukas Miklautz, Sepp Hochreiter
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments and Analysis |
| Researcher Affiliation | Academia | ¹ELLIS Unit Linz and LIT AI Lab, Institute for Machine Learning, Johannes Kepler University, Linz, Austria; ²Faculty of Computer Science, University of Vienna, Vienna, Austria; ³UniVie Doctoral School Computer Science, University of Vienna; ⁴Institute of Advanced Research in Artificial Intelligence (IARAI); lehner@ml.jku.at, alkin@ml.jku.at, lukas.miklautz@univie.ac.at |
| Pseudocode | No | The paper describes methods in prose and figures (e.g., Figure 3), but does not contain a structured pseudocode or algorithm block. |
| Open Source Code | Yes | We provide access to our code, model checkpoints and supplement on our project page: github.com/ml-jku/MAE-CT. |
| Open Datasets | Yes | Evaluation: We evaluate our approach via image classification on ImageNet (Deng et al. 2009). |
| Dataset Splits | Yes | Evaluation: We evaluate our approach via image classification on ImageNet (Deng et al. 2009), where we vary the number of used labels from 100% down to a single label per class. ... For evaluating the representation using 100% of the labels, we train a linear probe and a k-NN classifier. With 10% and 1% of the labels, we fine-tune the encoder, and in the extreme low-shot settings (<1% labels), we report the accuracy of a logistic regression classifier averaged over three splits. The detailed protocols can be found in Supplement B. (An illustrative k-NN probe sketch follows the table.) |
| Hardware Specification | Yes | We acknowledge the EuroHPC Joint Undertaking for awarding us access to Karolina at IT4Innovations, Czech Republic and to MeluXina at LuxProvide, Luxembourg. |
| Software Dependencies | No | The paper describes various model parameters, optimizers, and training schedules (e.g., 'MAE pre-training. We train for 1600 epochs with a learning rate of 1.5e-4 and use the normalize pixels variant of the MAE loss'), but does not specify software dependencies with version numbers. |
| Experiment Setup | Yes | Implementation Details: We outline the most important implementation details and provide all further information in Supplement A. MAE pre-training: We train for 1600 epochs with a learning rate of 1.5e-4 and use the normalize pixels variant of the MAE loss, which applies a patch-wise normalization to the target pixels before the mean-squared-error loss. NNCLR initialization: Following (Dwibedi et al. 2021), we use a 3-layer MLP as projector, a 2-layer MLP as predictor and a queue Q of length 65536. To initialize the NNCLR head, we train for 20 epochs on the output of the fully frozen pre-trained MAE encoder with a learning rate of 1e-4, a temperature τ of 0.15 and the default top1-NN lookup. Contrastive tuning: We use a learning rate of 1e-4 and apply layer-wise learning rate decay (Clark et al. 2020) with decay factor 0.65 to the upper half of the ViT blocks while freezing the lower half. For MAE-CTmin, we train ViT-B/L for 20 epochs and ViT-H for 30 epochs. For MAE-CTaug, we train ViT-B for 80 epochs and ViT-L/H for 40 epochs. (Illustrative sketches of the reconstruction loss, the NNCLR objective, the layer-wise decay, and the k-NN probe follow the table.) |
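The Experiment Setup row quotes the "normalize pixels" variant of the MAE loss: target patches are normalized patch-wise before the mean-squared-error. Below is a minimal sketch of that computation, not the authors' code; the tensor names and shapes are assumptions.

```python
import torch

def normalized_pixel_mse(pred, target_patches, mask, eps=1e-6):
    """MSE on patch-wise normalized target pixels (the 'normalize pixels' MAE variant).

    Assumed shapes:
      pred, target_patches: (batch, num_patches, patch_dim)
      mask: (batch, num_patches), 1 for masked (reconstructed) patches, 0 otherwise.
    """
    # Normalize each target patch to zero mean and unit variance.
    mean = target_patches.mean(dim=-1, keepdim=True)
    var = target_patches.var(dim=-1, keepdim=True)
    target = (target_patches - mean) / (var + eps).sqrt()

    # Mean-squared error per patch, averaged only over the masked patches.
    loss = ((pred - target) ** 2).mean(dim=-1)
    return (loss * mask).sum() / mask.sum()
```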
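The NNCLR initialization row describes a projector/predictor head with a queue of length 65536, temperature 0.15, and top1-NN lookup. The sketch below illustrates the core NNCLR objective (Dwibedi et al. 2021) under those settings; the function name, shapes, and single-direction loss are simplifying assumptions (the full method symmetrizes over the two views and maintains the queue separately).

```python
import torch
import torch.nn.functional as F

def nnclr_loss(z, p, queue, temperature=0.15):
    """Minimal NNCLR-style loss: replace the projector output z of one view by its
    top-1 nearest neighbour from the queue and contrast it against the predictor
    output p of the other view.

    Assumed shapes: z, p: (batch, dim); queue: (queue_len, dim), e.g. 65536 x dim.
    """
    z = F.normalize(z, dim=-1)
    p = F.normalize(p, dim=-1)
    queue = F.normalize(queue, dim=-1)

    # Top-1 nearest-neighbour lookup in the queue (no gradient flows through it).
    nn_idx = (z @ queue.T).argmax(dim=-1)
    nn = queue[nn_idx].detach()                 # (batch, dim)

    # InfoNCE: the neighbour of sample i should match the prediction for sample i.
    logits = nn @ p.T / temperature             # (batch, batch)
    labels = torch.arange(z.size(0), device=z.device)
    return F.cross_entropy(logits, labels)
```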
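The contrastive-tuning row specifies layer-wise learning rate decay (factor 0.65) on the upper half of the ViT blocks with the lower half frozen. The following sketch shows one way to build such optimizer parameter groups; it covers only the transformer blocks (not patch embedding or the head) and the helper name is hypothetical.

```python
import torch

def contrastive_tuning_param_groups(encoder_blocks, base_lr=1e-4, decay=0.65):
    """Freeze the lower half of the ViT blocks and apply layer-wise lr decay to the
    upper half (topmost block gets base_lr, each block below is scaled by `decay`).

    `encoder_blocks` is assumed to be an ordered list of nn.Module blocks, bottom to top.
    """
    num_blocks = len(encoder_blocks)
    frozen = encoder_blocks[: num_blocks // 2]
    tuned = encoder_blocks[num_blocks // 2:]

    # Freeze the lower half entirely.
    for block in frozen:
        for p in block.parameters():
            p.requires_grad = False

    # Upper half: deeper blocks keep the full lr, shallower blocks are decayed.
    groups = []
    for depth, block in enumerate(tuned):
        scale = decay ** (len(tuned) - 1 - depth)
        groups.append({"params": list(block.parameters()), "lr": base_lr * scale})
    return groups

# Usage with a hypothetical `vit` whose blocks live in `vit.blocks`:
# optimizer = torch.optim.AdamW(contrastive_tuning_param_groups(list(vit.blocks)))
```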
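The Dataset Splits row mentions a k-NN classifier on the frozen representation for the 100%-label evaluation. As a rough illustration, here is a similarity-weighted k-NN probe on precomputed features; the values of k and the temperature are assumptions, not numbers reported in the paper.

```python
import torch
import torch.nn.functional as F

def knn_probe(train_feats, train_labels, test_feats, k=20, temperature=0.07):
    """Weighted k-NN classification on frozen encoder features.

    Assumed inputs: train_feats (N, dim), train_labels (N,) int64, test_feats (M, dim).
    Returns predicted class indices of shape (M,).
    """
    train_feats = F.normalize(train_feats, dim=-1)
    test_feats = F.normalize(test_feats, dim=-1)
    num_classes = int(train_labels.max()) + 1

    sims = test_feats @ train_feats.T            # cosine similarities (M, N)
    topk_sims, topk_idx = sims.topk(k, dim=-1)
    topk_labels = train_labels[topk_idx]         # (M, k)

    # Similarity-weighted vote over the k nearest neighbours.
    weights = (topk_sims / temperature).exp().unsqueeze(-1)      # (M, k, 1)
    one_hot = F.one_hot(topk_labels, num_classes).float()        # (M, k, C)
    scores = (weights * one_hot).sum(dim=1)                      # (M, C)
    return scores.argmax(dim=-1)
```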