Searching the Search Space of Vision Transformer

Authors: Minghao Chen, Kan Wu, Bolin Ni, Houwen Peng, Bei Liu, Jianlong Fu, Hongyang Chao, Haibin Ling

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | The experiments on ImageNet verify the proposed automatic search space design method can improve the effectiveness of design space, thus boosting the performance of searched architectures. The discovered models achieve superior performance compared to the recent ViT [10] and Swin [27] transformer families under aligned settings. Moreover, the experiments on downstream vision and vision-language tasks, such as object detection, semantic segmentation and visual question answering, demonstrate the generality of the method. |
| Researcher Affiliation | Collaboration | Minghao Chen1, Kan Wu2, Bolin Ni3, Houwen Peng4, Bei Liu4, Jianlong Fu4, Hongyang Chao2, Haibin Ling1 (1Stony Brook University, 2Sun Yat-sen University, 3Institute of Automation, CAS, 4Microsoft Research) |
| Pseudocode | Yes | Algorithm 1: Searching the Search Space |
| Open Source Code | No | "Code and models will be available at here." (in abstract) AND "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No]" |
| Open Datasets | Yes | The experiments on ImageNet verify the proposed automatic search space design method can improve the effectiveness of design space, thus boosting the performance of searched architectures. (ImageNet [9]) ... We conduct experiments on COCO [24] for object detection, ADE20K [56] for semantic segmentation and VQA 2.0 [54] for visual question answering, respectively. |
| Dataset Splits | No | The paper mentions Dtrain, Dval, and the ImageNet validation set, but does not specify explicit percentages or counts for training/validation/test splits. |
| Hardware Specification | Yes | The models are trained for 300 epochs with 16 Nvidia Tesla 32G V100 GPUs. |
| Software Dependencies | No | The paper mentions the 'AdamW [28] optimizer' but does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | Similar to DeiT [40], we train the supernet with the following settings: AdamW [28] optimizer with weight decay 0.05, initial learning rate 1×10⁻³ and minimal learning rate 1×10⁻⁵ with cosine scheduler, 20 epochs warmup, batch size of 1024, 0.1 label smoothing, and stochastic depth with drop rate 0.2. |
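
The quoted training recipe maps onto standard PyTorch components. The sketch below is a minimal illustration of those hyperparameters only; the model, data pipeline, warmup shape, and the paper's supernet weight-sharing logic are assumptions or placeholders, not the authors' code.

```python
# Minimal sketch (not the authors' implementation): wiring the reported
# supernet hyperparameters into standard PyTorch objects. The linear warmup
# shape is an assumption; the paper only states "20 epochs warmup".
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

EPOCHS, WARMUP_EPOCHS = 300, 20
BASE_LR, MIN_LR = 1e-3, 1e-5   # initial / minimal learning rate
BATCH_SIZE = 1024              # reported global batch size

def build_training_objects(model: torch.nn.Module):
    # AdamW with weight decay 0.05, as quoted above
    optimizer = AdamW(model.parameters(), lr=BASE_LR, weight_decay=0.05)
    # Cosine decay to the minimal learning rate after warmup
    scheduler = CosineAnnealingLR(
        optimizer, T_max=EPOCHS - WARMUP_EPOCHS, eta_min=MIN_LR
    )
    # 0.1 label smoothing for the classification loss; stochastic depth
    # (drop rate 0.2) lives inside the model definition and is not shown here
    criterion = torch.nn.CrossEntropyLoss(label_smoothing=0.1)
    return optimizer, scheduler, criterion

def set_warmup_lr(optimizer, epoch: int):
    # Linear warmup over the first 20 epochs (assumed schedule shape)
    if epoch < WARMUP_EPOCHS:
        for group in optimizer.param_groups:
            group["lr"] = BASE_LR * (epoch + 1) / WARMUP_EPOCHS
```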