Boosting Vanilla Lightweight Vision Transformers via Re-parameterization

Authors: Zhentao Tan, Xiaodan Li, Yue Wu, Qi Chu, Le Lu, Nenghai Yu, Jieping Ye

ICLR 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "Experiments demonstrate that our proposed method not only boosts the performance of vanilla ViT-Tiny on various vision tasks to new state-of-the-art (SOTA) but also shows promising generality ability on other networks."
Researcher Affiliation | Collaboration | Alibaba Cloud, Alibaba Group, University of Science and Technology of China, East China Normal University
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | "Code will be available."
Open Datasets | Yes | "We pre-train our lightweight ViT models on ImageNet (Deng et al., 2009), which contains about 1.2M training images. We validate the performance on downstream tasks including image classification on ImageNet (Deng et al., 2009), semantic image segmentation on ADE20K (Zhou et al., 2019), and object detection and instance segmentation on MS COCO (Lin et al., 2014)."
Dataset Splits | Yes | "We validate the performance on downstream tasks including image classification on ImageNet (Deng et al., 2009), semantic image segmentation on ADE20K (Zhou et al., 2019), and object detection and instance segmentation on MS COCO (Lin et al., 2014)."
Hardware Specification | Yes | "Efficiency comparison between pre-training and inference on V100 GPUs."
Software Dependencies | No | The paper mentions a "PyTorch-style implementation", the "AdamW optimizer", the "BEiT semantic segmentation codebase", and the "detectron2 codebase" but does not provide version numbers for these software components.
Experiment Setup | Yes | "We use AdamW optimizer (Loshchilov & Hutter, 2017) (with the initial learning rate 2.4e-3, weight decay 0.05, and batch size 4096) to train the model for 300 epochs." and "Table 4: Fine-tuning settings of ViT-Tiny for ImageNet classification."
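
The Experiment Setup row translates directly into a standard PyTorch optimizer configuration. The following is a minimal sketch that uses only the hyperparameters quoted above; the model is a placeholder stand-in for ViT-Tiny, and no learning-rate schedule is shown because the quoted text does not specify one.

import torch

model = torch.nn.Linear(192, 1000)  # placeholder; the paper pre-trains a ViT-Tiny

# Hyperparameters quoted from the paper's pre-training setup.
optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=2.4e-3,          # initial learning rate
    weight_decay=0.05,
)
epochs = 300            # total pre-training epochs
batch_size = 4096       # global batch size, typically sharded across GPUs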
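
For context on the technique named in the title, the sketch below illustrates the general idea of structural re-parameterization in PyTorch: parallel branches used during training are algebraically folded into a single layer for inference. The RepLinear module and its two-branch design are illustrative assumptions for exposition, not the paper's actual architecture.

import torch
import torch.nn as nn

class RepLinear(nn.Module):
    # Train-time module: two parallel linear branches whose outputs are summed.
    def __init__(self, dim):
        super().__init__()
        self.main = nn.Linear(dim, dim)
        self.aux = nn.Linear(dim, dim)  # extra branch, folded away after training

    def forward(self, x):
        return self.main(x) + self.aux(x)

    @torch.no_grad()
    def reparameterize(self):
        # main(x) + aux(x) = x @ (W_m + W_a).T + (b_m + b_a), so both
        # branches collapse into one equivalent nn.Linear for inference.
        fused = nn.Linear(self.main.in_features, self.main.out_features)
        fused.weight.copy_(self.main.weight + self.aux.weight)
        fused.bias.copy_(self.main.bias + self.aux.bias)
        return fused

x = torch.randn(2, 192)
m = RepLinear(192)
assert torch.allclose(m(x), m.reparameterize()(x), atol=1e-6)

The fused layer computes one matrix multiplication instead of two while producing identical outputs, which is what lets re-parameterized models keep train-time capacity without inference-time cost.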