Lightweight Vision Transformer with Bidirectional Interaction

Authors: Qihang Fan, Huaibo Huang, Xiaoqiang Zhou, Ran He

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Extensive experiments on multiple vision tasks demonstrate that FAT achieves impressive performance." "We conducted experiments on a wide range of vision tasks, including image classification on ImageNet-1K [9], object detection and instance segmentation on COCO 2017 [33], and semantic segmentation on ADE20K [80]." |
| Researcher Affiliation | Academia | Qihang Fan 1,2, Huaibo Huang 1, Xiaoqiang Zhou 1,3, Ran He 1,2. 1 MAIS & CRIPAC, Institute of Automation, Chinese Academy of Sciences, Beijing, China; 2 School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; 3 University of Science and Technology of China, Hefei, China |
| Pseudocode | No | The paper provides mathematical equations and architectural diagrams to illustrate its components, but it includes no section labeled "Pseudocode" or "Algorithm" and no structured, code-like steps for its methods. |
| Open Source Code | No | The paper neither states that source code will be released nor links to a code repository for the described methodology. |
| Open Datasets | Yes | "We conducted experiments on a wide range of vision tasks, including image classification on ImageNet-1K [9], object detection and instance segmentation on COCO 2017 [33], and semantic segmentation on ADE20K [80]." |
| Dataset Splits | Yes | "We train our models on ImageNet-1K [9] from scratch." and "We conducted experiments on the COCO 2017 dataset [33]." |
| Hardware Specification | Yes | "The CPU is Intel i9 Core and the GPU is V100." |
| Software Dependencies | No | The paper names software tools such as MMDetection, MMSegmentation, AdamW, RandAugment, Mixup, CutMix, and Random Erasing, but gives no version numbers for them. |
| Experiment Setup | Yes | "All models are trained for 300 epochs from scratch. To train the models, we use the AdamW optimizer with a cosine decay learning rate scheduler and 20 epoch linear warm-up. We set the initial learning rate, weight decay, and batch size to 0.001, 0.05, and 1024, respectively." |
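For reference, the reported training schedule (AdamW, 20-epoch linear warm-up, cosine decay over 300 epochs, base learning rate 0.001) can be sketched in plain Python. The per-epoch warm-up and decay formulas below are assumptions for illustration; the paper does not spell out the exact implementation of its scheduler.

```python
import math

# Hyperparameters as reported in the paper's experiment setup.
TOTAL_EPOCHS = 300
WARMUP_EPOCHS = 20
BASE_LR = 1e-3  # weight decay 0.05 and batch size 1024 are also reported

def lr_at_epoch(epoch: int) -> float:
    """Learning rate under linear warm-up followed by cosine decay.

    A minimal sketch of the schedule described in the paper; the
    precise per-step behaviour of the authors' trainer is unknown.
    """
    if epoch < WARMUP_EPOCHS:
        # Linear ramp from 0 up to the base learning rate.
        return BASE_LR * (epoch + 1) / WARMUP_EPOCHS
    # Cosine decay from BASE_LR down to 0 over the remaining epochs.
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))
```

In a PyTorch-style training loop this function would set the optimizer's learning rate once per epoch, with `torch.optim.AdamW(params, lr=1e-3, weight_decay=0.05)` as the reported optimizer configuration.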