Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency

Authors: Viraj Prabhu, Sriram Yenamandra, Aaditya Singh, Judy Hoffman

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our simple approach leads to consistent performance gains over competing methods that use ViTs and self-supervised initializations on standard object recognition benchmarks. Our code is available at https://github.com/virajprabhu/PACMAC. ... We evaluate PACMAC on three classification benchmarks for domain adaptation... Tables 1, 2, and 3 present results.
Researcher Affiliation | Academia | Viraj Prabhu, Sriram Yenamandra, Aaditya Singh, Judy Hoffman ({virajp,sriramy,asingh,judy}@gatech.edu), Georgia Institute of Technology
Pseudocode | Yes | Algorithm 1: Attention-conditioned Masking ... Algorithm 2: PACMAC Optimization
Open Source Code | Yes | Our code is available at https://github.com/virajprabhu/PACMAC.
Open Datasets | Yes | We evaluate PACMAC on three classification benchmarks for domain adaptation: i) OfficeHome [30]... ii) DomainNet [31]... iii) VisDA2017 [32]...
Dataset Splits | Yes | In unsupervised domain adaptation (UDA) we are given access to labeled source instances (x_S, y_S) ∼ P_S(X, Y) and unlabeled target instances x_T ∼ P_T(X)... For a target instance x_T ∼ P_T, we generate a committee of k masked versions. ... Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Sec. 4.2 and supplementary.
Hardware Specification | No | The paper mentions "All experiments use PyTorch [48]" but does not specify any hardware such as GPUs or CPUs. The checklist indicates that the information is in the supplementary material, but not in the main paper: "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] See supplementary."
Software Dependencies | No | All experiments use PyTorch [48]. ... We use the AdamW [46] optimizer. ... We use RandAugment [47]. No specific version numbers are provided for these software libraries, which is required for a reproducible description.
Experiment Setup | Yes | We pretrain on the combined source and target domain for 800 epochs (MAE) and 200 epochs (DINO). For pretraining, we linearly scale the learning rate to 4×10^-4 (MAE) and 5×10^-5 (DINO) during a 40-epoch warmup phase followed by a cosine decay. We use the AdamW [46] optimizer. For PACMAC, we use k = 2, m_r = 0.75, T = 50%, and α = 0.1. We use RandAugment [47] with N = 3 and M = 4.0 during pretraining and N = 1 and M = 2.0 during DA. On OfficeHome and DomainNet, we finetune on the source and adapt for 100 epochs each, and perform 10 epochs of each phase on VisDA. We use a learning rate of 2×10^-4 and weight decay of 0.05.
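
As context for the Pseudocode row, the sketch below gives one plausible reading of Algorithm 1 (Attention-conditioned Masking): bias each of the k patch masks toward patches that receive high attention from the pretrained ViT. The function name, the use of class-token attention scores as input, and the exact sampling rule are assumptions, not the authors' released implementation.

```python
import torch

def attention_conditioned_masks(attn_scores, k=2, mask_ratio=0.75):
    """Sketch: build k patch masks biased toward highly attended patches.

    attn_scores: (num_patches,) attention of the class token over image patches.
    Returns a (k, num_patches) boolean tensor where True marks a masked patch.
    """
    num_patches = attn_scores.shape[0]
    num_masked = int(mask_ratio * num_patches)
    probs = attn_scores / attn_scores.sum()  # normalize attention to a distribution
    masks = torch.zeros(k, num_patches, dtype=torch.bool)
    for i in range(k):
        # Sample patches to mask without replacement, favoring salient regions.
        masked_idx = torch.multinomial(probs, num_masked, replacement=False)
        masks[i, masked_idx] = True
    return masks
```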
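
The Dataset Splits row quotes the committee of k masked views used to decide which unlabeled target instances to self-train on. A minimal sketch of that selection test follows; it assumes a model whose forward pass accepts an optional patch_mask argument, and it treats agreement between every masked view and the unmasked prediction as the reliability criterion, which is one plausible reading of the paper's consistency check.

```python
import torch

def is_reliable_target(model, image, masks):
    """Sketch: keep a target image for self-training only if all k masked
    views predict the same class as the unmasked view."""
    model.eval()
    with torch.no_grad():
        pseudo_label = model(image.unsqueeze(0)).argmax(dim=-1)  # unmasked prediction
        masked_preds = torch.stack(
            [model(image.unsqueeze(0), patch_mask=m).argmax(dim=-1) for m in masks]
        )
    consistent = bool((masked_preds == pseudo_label).all())
    return consistent, pseudo_label.item()
```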
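
The Experiment Setup row maps onto a configuration roughly like the one below. The values are those quoted above; the dictionary layout, key names, and the AdamW constructor call are illustrative assumptions rather than the authors' training script.

```python
import torch

# Values quoted in the Experiment Setup row; structure and names are assumed.
config = {
    "pretrain_epochs": {"mae": 800, "dino": 200},
    "pretrain_peak_lr": {"mae": 4e-4, "dino": 5e-5},  # reached after a 40-epoch warmup, then cosine decay
    "pacmac": {"k": 2, "mask_ratio": 0.75, "threshold": 0.50, "alpha": 0.1},
    "randaugment": {"pretrain": {"N": 3, "M": 4.0}, "adapt": {"N": 1, "M": 2.0}},
    "adapt_epochs": 100,  # 10 per phase on VisDA
    "adapt_lr": 2e-4,
    "weight_decay": 0.05,
}

def make_optimizer(model):
    # AdamW with the adaptation-phase learning rate and weight decay quoted above.
    return torch.optim.AdamW(
        model.parameters(),
        lr=config["adapt_lr"],
        weight_decay=config["weight_decay"],
    )
```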
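
Since the Hardware Specification and Software Dependencies rows flag missing version and hardware details, a snippet like the following (not from the paper) is one way to record the environment alongside reported results:

```python
import platform
import torch

# Log the software and hardware environment for reproducibility.
print("python :", platform.python_version())
print("torch  :", torch.__version__)
print("cuda   :", torch.version.cuda)
if torch.cuda.is_available():
    print("gpu    :", torch.cuda.get_device_name(0))
```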