Green Hierarchical Vision Transformer for Masked Image Modeling

Authors: Lang Huang, Shan You, Mingkai Zheng, Fei Wang, Chen Qian, Toshihiko Yamasaki

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conduct experiments on the ImageNet-1K [60] (BSD 3-Clause License) image classification dataset and MS-COCO [47] (CC BY 4.0 License) object detection/instance segmentation dataset. |
| Researcher Affiliation | Collaboration | ¹The University of Tokyo; ²SenseTime Research; ³The University of Sydney |
| Pseudocode | Yes | Algorithm 1 Optimal Grouping |
| Open Source Code | Yes | Corresponding author. Code and pre-trained models: https://github.com/LayneH/GreenMIM. |
| Open Datasets | Yes | We conduct experiments on the ImageNet-1K [60] (BSD 3-Clause License) image classification dataset and MS-COCO [47] (CC BY 4.0 License) object detection/instance segmentation dataset. |
| Dataset Splits | Yes | We fine-tune the pre-trained models on the ImageNet-1K dataset and report the results on the validation set in Table 2. All models are fine-tuned on the MS-COCO [47] 2017 train split (~118k images) and finally evaluated on the val split (~5k images). |
| Hardware Specification | Yes | All the experiments of our method are performed on a single machine with eight 32GB Tesla V100 GPUs. |
| Software Dependencies | Yes | CUDA 10.1, PyTorch [54] 1.8 |
| Experiment Setup | Yes | The models are trained for 100/200/400/800 epochs with a total batch size of 2,048. We use the AdamW optimizer [41] with the cosine annealing schedule [50]. We set the base learning rate to 1.5e-4, the weight decay to 0.05, the AdamW hyper-parameters to β1 = 0.9, β2 = 0.999, and the number of warmup epochs to 40 with an initial base learning rate of 1.5e-7. |
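The quoted setup (linear warmup from 1.5e-7 to a base rate of 1.5e-4 over 40 epochs, then cosine annealing) can be sketched as a per-epoch schedule. This is a minimal stdlib-only illustration, not the paper's code; the minimum learning rate at the end of annealing (here 0) and per-epoch rather than per-step updates are assumptions.

```python
import math

def lr_at_epoch(epoch: int,
                total_epochs: int = 800,      # longest schedule reported
                base_lr: float = 1.5e-4,
                warmup_epochs: int = 40,
                warmup_start_lr: float = 1.5e-7,
                min_lr: float = 0.0) -> float:  # final LR is an assumption
    """Linear warmup followed by cosine annealing, per the quoted setup."""
    if epoch < warmup_epochs:
        # Linear ramp from warmup_start_lr up to base_lr.
        return warmup_start_lr + (base_lr - warmup_start_lr) * epoch / warmup_epochs
    # Cosine decay from base_lr down to min_lr over the remaining epochs.
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))
```

For the 100/200/400/800-epoch runs, only `total_epochs` would change; the warmup length and learning rates quoted above stay fixed.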