MAST: Masked Augmentation Subspace Training for Generalizable Self-Supervised Priors

Authors: Chen Huang, Hanlin Goh, Jiatao Gu, Joshua M. Susskind

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments demonstrate that MAST consistently improves generalization on various downstream tasks, while being task-agnostic and efficient during SSL. ... (Section 4, Experiments) Main results. We start with evaluating our MAST method for SSL on ImageNet. Table 1 shows the results from both linear and semi-supervised evaluations, using the same optimization procedures of our baseline VICReg (Bardes et al., 2022). (The linear-evaluation protocol referenced here is sketched after this table.)
Researcher Affiliation | Industry | Chen Huang, Hanlin Goh, Jiatao Gu & Josh Susskind, Apple Inc. {chen-huang,hanlin,jgu32,jsusskind}@apple.com
Pseudocode | No | The paper describes the method using prose and mathematical equations but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | Our pretraining is mostly performed on the unlabeled ImageNet dataset (Deng et al., 2009)... The same pretraining protocol is followed: pretraining ResNet-18 for 200 epochs on STL-10 dataset (Coates et al., 2011)...
Dataset Splits | No | The paper mentions using '1% and 10% ImageNet samples' for semi-supervised classification and a 'training set' for linear classification, but it does not give explicit training/validation/test split percentages for the main experiments, and in particular does not describe a separate validation set.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies | No | The paper mentions the use of the LARS optimizer but does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks.
Experiment Setup | Yes | Loss coefficients in Eq. (7) are set as α = 25, β = 1 following VICReg (Bardes et al., 2022), and we set λ = 25d/K, λ1 = 600/(dK) and λ2 = 25 to generate comparable loss magnitudes. ... Training details: The training protocol follows that in (Bardes et al., 2022): with batch size 2048, the LARS optimizer (You et al., 2017) runs for 1000 epochs with weight decay 10^-6 and learning rate 1.6. The learning rate follows a cosine annealing schedule (Loshchilov & Hutter, 2017).
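For concreteness, the sketch below wires together the hyperparameters quoted in the Experiment Setup row (loss coefficients, batch size, epochs, weight decay, base learning rate, cosine annealing). It is not the authors' code: the values of d (embedding dimension) and K (number of augmentation subspaces) are illustrative placeholders, and the cosine_lr helper is only one plausible reading of the reported schedule.

```python
# Sketch of the reported MAST/VICReg-style training configuration.
# d and K below are placeholder values, not taken from the paper.
import math

# Loss coefficients of Eq. (7), following VICReg defaults.
alpha, beta = 25.0, 1.0

d, K = 8192, 4                  # assumed embedding dimension / number of augmentation subspaces
lam = 25.0 * d / K              # lambda   = 25 d / K
lam1 = 600.0 / (d * K)          # lambda_1 = 600 / (d K)
lam2 = 25.0                     # lambda_2 = 25

# Reported optimization settings (LARS optimizer, VICReg protocol).
batch_size = 2048
epochs = 1000
weight_decay = 1e-6
base_lr = 1.6

def cosine_lr(epoch: int) -> float:
    """Cosine annealing of the learning rate (Loshchilov & Hutter, 2017)."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / epochs))

if __name__ == "__main__":
    print(f"lambda={lam:.1f}, lambda1={lam1:.5f}, lambda2={lam2:.1f}")
    print(f"learning rate at epoch 500: {cosine_lr(500):.4f}")
```

Tying λ and λ1 to d and K, as in the quote above, keeps the subspace losses on a scale comparable to the VICReg terms.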
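The linear and semi-supervised evaluations mentioned in the Research Type row follow the standard SSL probing setup: the pretrained encoder is frozen and a linear classifier is trained on labeled ImageNet. A minimal sketch is given below, assuming a ResNet-50 backbone and placeholder optimizer settings; none of it is taken from the paper's (unreleased) code.

```python
# Minimal sketch of linear evaluation on a frozen SSL backbone (illustrative only;
# the optimizer settings and backbone loading here are assumptions, not the paper's).
import torch
import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50()                 # placeholder: load MAST/VICReg-pretrained weights here
backbone.fc = nn.Identity()           # expose the 2048-d pooled features
for p in backbone.parameters():
    p.requires_grad_(False)           # freeze the representation
backbone.eval()

linear_head = nn.Linear(2048, 1000)   # ImageNet-1k classifier
optimizer = torch.optim.SGD(linear_head.parameters(), lr=0.02, momentum=0.9)
criterion = nn.CrossEntropyLoss()

def linear_probe_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One training step: features are computed without gradients, only the head is updated."""
    with torch.no_grad():
        feats = backbone(images)
    loss = criterion(linear_head(feats), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

The semi-supervised variant reported in the paper fine-tunes on the 1% or 10% labeled ImageNet subsets instead of training only a frozen-feature classifier.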