Stochastic positional embeddings improve masked image modeling
Authors: Amir Bar, Florian Bordes, Assaf Shocher, Mido Assran, Pascal Vincent, Nicolas Ballas, Trevor Darrell, Amir Globerson, Yann LeCun
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4. Experiments and Results |
| Researcher Affiliation | Collaboration | ¹Tel Aviv University, ²UC Berkeley, ³Meta AI (FAIR), ⁴Now also at Google Research, ⁵New York University. |
| Pseudocode | Yes | Algorithm 1: MIM w/ StoP pseudo-code; requires only a minor implementation change, highlighted in light gray. (An illustrative sketch follows the table.) |
| Open Source Code | Yes | See https://github.com/amirbar/StoP for code. |
| Open Datasets | Yes | ImageNet (IN-1k) (Russakovsky et al., 2015), Places205 (Zhou et al., 2014a), iNaturalist 2018 (Van Horn et al., 2018), and CIFAR100 (Krizhevsky, 2009). |
| Dataset Splits | No | The paper reports training on the full IN-1k dataset for a given number of epochs and evaluating via linear probing on subsets such as '1% of IN-1k', but it does not explicitly define standard train/validation/test splits (percentages or sample counts) for the main model training process. |
| Hardware Specification | Yes | Here we pretrain all models for 300 epochs using 4 V100 nodes, with a total batch size of 2048. |
| Software Dependencies | No | The paper mentions the 'AdamW' optimizer but does not provide version numbers for software libraries, programming languages, or other dependencies such as Python, PyTorch, or CUDA. |
| Experiment Setup | Yes | The paper provides detailed pretraining settings in tables (e.g., Tables 9, 10, 11, 12), including optimizer ('AdamW'), epochs (300/600), learning rate, weight decay, batch size (2048), learning rate schedule, warmup epochs, predictor depth, attention heads, embedding dimension, and the noise hyperparameter σ. (A hedged configuration sketch also follows the table.) |
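To make the "minor implementation change" of Algorithm 1 concrete, here is a minimal PyTorch sketch of the StoP idea: at training time, the deterministic positional embeddings of masked tokens are perturbed with scaled Gaussian noise before being passed to the predictor. The class name, the learned noise projection, and the default σ value are illustrative assumptions, not code taken from the paper's repository.

```python
import torch
import torch.nn as nn

class StochasticPositionalEmbedding(nn.Module):
    """Perturbs positional embeddings with scaled Gaussian noise (training only)."""

    def __init__(self, dim: int, sigma: float = 0.25):
        super().__init__()
        self.sigma = sigma                            # noise scale σ (hyperparameter)
        self.proj = nn.Linear(dim, dim, bias=False)   # learned noise projection (assumed)

    def forward(self, pos_embed: torch.Tensor) -> torch.Tensor:
        # pos_embed: (batch, num_masked_tokens, dim) deterministic embeddings.
        if not self.training:
            return pos_embed                          # deterministic at evaluation time
        noise = torch.randn_like(pos_embed)           # z ~ N(0, I)
        return pos_embed + self.sigma * self.proj(noise)
```

In a masked-image-modeling pipeline, only the line that builds the predictor's positional embeddings for masked tokens would change, which is consistent with the paper's claim that StoP requires only a minor implementation change.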
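For reference, the pretraining settings quoted in the Experiment Setup row can be summarized as a configuration dictionary. Only fields stated above are filled in; placeholder fields are not specified in this report and must be read directly from Tables 9-12 of the paper.

```python
# Summary of the reported pretraining settings; None marks values that
# vary per table in the paper and are not quoted in this report.
pretrain_config = {
    "optimizer": "AdamW",
    "epochs": 300,          # a 600-epoch schedule is also reported
    "batch_size": 2048,
    "hardware": "4 V100 nodes",
    "learning_rate": None,  # see Tables 9-12
    "weight_decay": None,   # see Tables 9-12
    "warmup_epochs": None,  # see Tables 9-12
    "sigma": None,          # noise hyperparameter σ; see Tables 9-12
}
```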