DropPos: Pre-Training Vision Transformers by Reconstructing Dropped Positions
Authors: Haochen Wang, Junsong Fan, Yuxi Wang, Kaiyou Song, Tong Wang, Zhaoxiang Zhang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluations of DropPos show strong capabilities. DropPos outperforms supervised pre-training and achieves competitive results compared with state-of-the-art self-supervised alternatives on a wide range of downstream benchmarks. The code is publicly available at https://github.com/Haochen-Wang409/DropPos. |
| Researcher Affiliation | Collaboration | Haochen Wang (1,3), Junsong Fan (1,4), Yuxi Wang (1,4), Kaiyou Song (2), Tong Wang (2), Zhaoxiang Zhang (1,3,4). Affiliations: (1) Center for Research on Intelligent Perception and Computing (CRIPAC), State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences (CASIA); (2) Megvii Technology; (3) University of Chinese Academy of Sciences (UCAS); (4) Centre for Artificial Intelligence and Robotics, HKISI_CAS. |
| Pseudocode | Yes | Algorithm 1: Pseudo-code of DropPos (a hedged sketch of this procedure is given below the table). |
| Open Source Code | Yes | The code is publicly available at https://github.com/Haochen-Wang409/DropPos. |
| Open Datasets | Yes | We perform self-supervised pre-training on the ImageNet-1K [48] training set with a resolution of 224x224. |
| Dataset Splits | Yes | We perform self-supervised pre-training on the ImageNet-1K [48] training set with a resolution of 224x224. We report top-1 validation accuracy of a single 224x224 crop. |
| Hardware Specification | Yes | For ViT-B/16, pre-training and fine-tuning are conducted with 64 and 32 2080Ti GPUs, respectively. For ViT-L/16, pre-training and fine-tuning are conducted with 32 and 16 Tesla V100 GPUs, respectively. Experiments are conducted on 8 Tesla V100 GPUs. |
| Software Dependencies | No | The paper mentions software like Detectron2 [61], ViTDet [36], and MMSegmentation [14], but it does not specify the version numbers for these or any other key software components used in the experiments. |
| Experiment Setup | Yes | Hyperparameters (pre-training / fine-tuning): optimizer: AdamW / AdamW; base learning rate: 1.5e-4 / 1e-3; weight decay: 0.05 / 0.05; optimizer momentum: β1, β2 = 0.9, 0.95 / β1, β2 = 0.9, 0.999; layer-wise lr decay: 1.0 / 0.8; batch size: 4096 / 1024; learning rate schedule: cosine decay / cosine decay; warmup epochs: 10 (ViT-B/16), 40 (ViT-L/16) / 5; training epochs: 200 / 100 (ViT-B/16), 50 (ViT-L/16); augmentation: RandomResizedCrop / RandAug (9, 0.5) [16]; label smoothing: – / 0.1; mixup [68]: – / 0.8; cutmix [65]: – / 1.0; drop path [31]: – / 0.1. A hedged sketch of the pre-training optimizer setup is also given below the table. |
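
The Algorithm 1 entry above refers to the paper's pseudo-code for DropPos: position embeddings are dropped for a random subset of patch tokens, and the model is trained to classify which of the N patch positions each position-dropped token came from. Below is a minimal PyTorch sketch of that objective written for this report; the class and parameter names (DropPosSketch, gamma, mask_pos_token) are illustrative assumptions, and the official implementation at https://github.com/Haochen-Wang409/DropPos additionally uses patch masking, position smoothing, and attentive reconstruction.

```python
# Hedged sketch of the DropPos objective; names and the toy encoder are assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DropPosSketch(nn.Module):
    def __init__(self, num_patches=196, dim=192, depth=4, heads=3, gamma=0.25):
        super().__init__()
        self.gamma = gamma  # ratio of patches that KEEP their position embedding
        self.pos_embed = nn.Parameter(torch.randn(1, num_patches, dim) * 0.02)
        self.mask_pos_token = nn.Parameter(torch.zeros(1, 1, dim))  # stands in for dropped positions
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, num_patches)  # classify which of the N positions a patch came from

    def forward(self, patch_tokens):
        """patch_tokens: (B, N, dim) patch embeddings in their original spatial order."""
        B, N, _ = patch_tokens.shape
        # Randomly decide, per sample, which patches keep their position embedding.
        keep = torch.rand(B, N, device=patch_tokens.device) < self.gamma          # (B, N) bool
        pos = torch.where(keep.unsqueeze(-1),
                          self.pos_embed.expand(B, -1, -1),
                          self.mask_pos_token.expand(B, N, -1))
        logits = self.head(self.encoder(patch_tokens + pos))                      # (B, N, N) position logits
        # Cross-entropy only on patches whose position embeddings were dropped.
        target = torch.arange(N, device=patch_tokens.device).expand(B, N)
        return F.cross_entropy(logits[~keep], target[~keep])


# Toy usage with random patch embeddings standing in for a patchified image.
model = DropPosSketch()
loss = model(torch.randn(2, 196, 192))
loss.backward()
```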
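
The base learning rates in the experiment-setup row are presumably combined with the linear scaling rule common to MAE-style recipes (effective lr = base lr × batch size / 256); the table itself does not state this, so the sketch below is an assumption, showing how the pre-training AdamW optimizer and cosine schedule with warmup could be instantiated from the listed ViT-B/16 values.

```python
# Hedged sketch of the pre-training optimizer/schedule implied by the table.
# The lr scaling rule is an assumption; only the base values come from the paper.
import math
import torch

base_lr, batch_size, weight_decay = 1.5e-4, 4096, 0.05
warmup_epochs, total_epochs = 10, 200            # ViT-B/16 pre-training values from the table
lr = base_lr * batch_size / 256                  # assumed scaling rule, not stated in the table

model = torch.nn.Linear(8, 8)                    # placeholder for the ViT backbone
optimizer = torch.optim.AdamW(model.parameters(), lr=lr,
                              betas=(0.9, 0.95), weight_decay=weight_decay)

def lr_at(epoch):
    """Cosine decay with linear warmup, per the 'learning rate schedule' row."""
    if epoch < warmup_epochs:
        return lr * epoch / warmup_epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return lr * 0.5 * (1.0 + math.cos(math.pi * progress))

for g in optimizer.param_groups:
    g["lr"] = lr_at(epoch=50)                    # example: set the lr for epoch 50
```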