Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Stochastic positional embeddings improve masked image modeling
Authors: Amir Bar, Florian Bordes, Assaf Shocher, Mido Assran, Pascal Vincent, Nicolas Ballas, Trevor Darrell, Amir Globerson, Yann LeCun
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Experiments and Results |
| Researcher Affiliation | Collaboration | ¹Tel Aviv University, ²UC Berkeley, ³Meta AI (FAIR), ⁴Now also at Google Research, ⁵New York University. |
| Pseudocode | Yes | Algorithm 1 MIM w/ StoP pseudo-code. requires only a minor implementation change, highlighted in light gray. |
| Open Source Code | Yes | See https://github.com/amirbar/StoP for code. |
| Open Datasets | Yes | ImageNet (IN-1k) (Russakovsky et al., 2015), Places205 (Zhou et al., 2014a), iNaturalist 2018 (Van Horn et al., 2018), and CIFAR-100 (Krizhevsky, 2009). |
| Dataset Splits | No | The paper states training on the full IN-1k dataset for a certain number of epochs and evaluating via linear probing on subsets like '1% of IN-1k', but it does not explicitly define standard train/validation/test splits with percentages or sample counts for the main model training process. |
| Hardware Specification | Yes | Here we pretrain all models for 300 epochs using 4 V100 nodes, on a total batch size of 2048. |
| Software Dependencies | No | The paper mentions 'optimizer AdamW' but does not provide specific version numbers for software libraries, programming languages, or other dependencies like Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The paper provides detailed pretraining settings in tables (e.g., Tables 9, 10, 11, 12), including optimizer ('AdamW'), epochs (300/600), learning rate, weight decay, batch size (2048), learning rate schedule, warmup epochs, predictor depth, attention heads, embedding dimension, and the noise hyperparameter σ. |