Stochastic positional embeddings improve masked image modeling

Authors: Amir Bar, Florian Bordes, Assaf Shocher, Mido Assran, Pascal Vincent, Nicolas Ballas, Trevor Darrell, Amir Globerson, Yann LeCun

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4. Experiments and Results
Researcher Affiliation | Collaboration | 1 Tel Aviv University, 2 UC Berkeley, 3 Meta AI (FAIR), 4 Now also at Google Research, 5 New York University.
Pseudocode | Yes | Algorithm 1: MIM w/ StoP pseudo-code; it requires only a minor implementation change, highlighted in light gray. (An illustrative sketch of this change appears after the table.)
Open Source Code | Yes | See https://github.com/amirbar/StoP for code.
Open Datasets | Yes | ImageNet (IN-1k) (Russakovsky et al., 2015), Places205 (Zhou et al., 2014a), iNaturalist 2018 (Van Horn et al., 2018), and CIFAR100 (Krizhevsky, 2009).
Dataset Splits | No | The paper states training on the full IN-1k dataset for a certain number of epochs and evaluating via linear probing on subsets like '1% of IN-1k', but it does not explicitly define standard train/validation/test splits with percentages or sample counts for the main model training process.
Hardware Specification | Yes | Here we pretrain all models for 300 epochs using 4 V100 nodes, on a total batch size of 2048.
Software Dependencies | No | The paper mentions the AdamW optimizer but does not provide specific version numbers for software libraries, programming languages, or other dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | The paper provides detailed pretraining settings in tables (e.g., Tables 9, 10, 11, 12), including the optimizer (AdamW), epochs (300/600), learning rate, weight decay, batch size (2048), learning rate schedule, warmup epochs, predictor depth, attention heads, embedding dimension, and the noise hyperparameter σ. (A hedged configuration sketch follows the table.)
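
On the Pseudocode row: StoP is described as a minor implementation change to masked image modeling, making the positional information supplied for masked targets stochastic. The sketch below is a minimal illustration of that idea, assuming the core operation is adding Gaussian noise scaled by σ to the target positional embeddings; the function name, signature, and eval-time behavior are assumptions of this sketch, not the authors' code (see the repository linked above for the actual Algorithm 1).

```python
# Illustrative sketch only: adds zero-mean Gaussian noise, scaled by sigma, to
# the positional embeddings of masked target tokens. Names are hypothetical;
# the paper's Algorithm 1 may differ in details (e.g., it may pass the noise
# through a learned projection). See https://github.com/amirbar/StoP.
import torch


def stochastic_pos_embed(pos_embed: torch.Tensor, sigma: float, training: bool = True) -> torch.Tensor:
    """Perturb target positional embeddings with Gaussian noise of scale sigma."""
    if not training or sigma == 0.0:
        # Deterministic positions when noise is disabled (e.g., at evaluation).
        return pos_embed
    return pos_embed + sigma * torch.randn_like(pos_embed)


# Hypothetical usage inside an MIM predictor: the query for each masked target
# patch is built from a mask token plus its (now stochastic) positional embedding.
# queries = mask_token + stochastic_pos_embed(target_pos, sigma=0.25)
```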
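On the Experiment Setup row: the sketch below shows, purely for illustration, how the reported settings could be assembled into a pretraining configuration. Only the optimizer (AdamW), the 300-epoch schedule, the total batch size of 2048, and the existence of a noise scale σ come from the text above; every value marked as a placeholder is an assumption, not a number from the paper's tables.

```python
# Hedged configuration sketch. Values flagged "placeholder" are NOT taken from
# the paper; they only show where the settings listed in the table would go.
import torch

config = {
    "epochs": 300,              # 300 (600 in some settings, per Tables 9-12)
    "total_batch_size": 2048,   # as reported (4 V100 nodes)
    "lr": 1.5e-4,               # placeholder
    "weight_decay": 0.05,       # placeholder
    "warmup_epochs": 15,        # placeholder
    "sigma": 0.25,              # StoP noise scale; placeholder value
}

# Stand-in module; in the paper this would be the ViT encoder/predictor.
model = torch.nn.Linear(768, 768)
optimizer = torch.optim.AdamW(
    model.parameters(), lr=config["lr"], weight_decay=config["weight_decay"]
)
```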