BEiT: BERT Pre-Training of Image Transformers
Authors: Hangbo Bao, Li Dong, Songhao Piao, Furu Wei
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on image classification and semantic segmentation show that our model achieves competitive results with previous pre-training methods. |
| Researcher Affiliation | Collaboration | Hangbo Bao, Li Dong, Songhao Piao, Furu Wei (Harbin Institute of Technology; Microsoft Research) |
| Pseudocode | Yes | Algorithm 1 Blockwise Masking (sketched below the table) |
| Open Source Code | Yes | https://github.com/microsoft/unilm |
| Open Datasets | Yes | We pretrain BEIT on the training set of ImageNet-1K (Russakovsky et al., 2015), which contains about 1.2M images. |
| Dataset Splits | No | The paper mentions using the 'training set of ImageNet-1K' and evaluating on the 'ILSVRC-2012 ImageNet dataset' but does not provide specific training/validation/test split percentages or sample counts. |
| Hardware Specification | Yes | The 500k training steps take about five days using 16 Nvidia Tesla V100 32GB GPU cards. |
| Software Dependencies | No | The paper mentions optimizers (Adam) and other models (SETR-PUP) by reference, but does not specify the versions of software libraries or frameworks used (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | The paper details hyperparameters in Section 2.5 (Pre-Training Setup) and in Appendix G (pre-training, Table 12), Appendix H (image classification fine-tuning, Table 13), and Appendix I (ADE20K semantic segmentation fine-tuning, Table 14), covering learning rates, batch sizes, optimizers, and other settings. |
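
The pseudocode row above cites Algorithm 1 (Blockwise Masking). The following is a minimal, non-authoritative Python sketch of that procedure, assuming the setup commonly associated with BEiT-Base: a 14×14 patch grid (224×224 images with 16×16 patches), a masking budget of roughly 40% of patches, a minimum block of 16 patches, and an aspect ratio bounded by 0.3 and 1/0.3. The function name `blockwise_mask` and the exact parameter defaults are illustrative choices, not the authors' reference implementation; consult the released code at https://github.com/microsoft/unilm for the definitive version.

```python
import math
import random

def blockwise_mask(height=14, width=14, mask_ratio=0.4,
                   min_block=16, min_aspect=0.3):
    """Sketch of blockwise masking: repeatedly sample rectangular blocks
    of patches until roughly `mask_ratio` of the grid is masked.
    Parameter defaults are assumptions, not values copied from the paper's code."""
    num_patches = height * width
    target = int(mask_ratio * num_patches)
    masked = set()
    while len(masked) < target:
        # Sample a block size between the minimum and the remaining budget.
        s = random.randint(min_block, max(min_block, target - len(masked)))
        # Sample an aspect ratio and derive the block's height/width in patches.
        r = random.uniform(min_aspect, 1.0 / min_aspect)
        a = int(round(math.sqrt(s * r)))   # block height
        b = int(round(math.sqrt(s / r)))   # block width
        if a == 0 or b == 0 or a > height or b > width:
            continue  # resample if the block does not fit the grid
        # Place the block at a random position and add its patches to the mask.
        t = random.randint(0, height - a)
        l = random.randint(0, width - b)
        for i in range(t, t + a):
            for j in range(l, l + b):
                masked.add((i, j))
    return masked

if __name__ == "__main__":
    m = blockwise_mask()
    print(f"masked {len(m)} of 196 patches (~{len(m) / 196:.0%})")
```

Because blocks may overlap, the loop keeps sampling until the union of masked patches reaches the budget, so the final count can slightly overshoot the target; the masked positions are then the ones whose visual tokens the pre-training objective must predict.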