Lightweight Vision Transformer with Bidirectional Interaction
Authors: Qihang Fan, Huaibo Huang, Xiaoqiang Zhou, Ran He
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on multiple vision tasks demonstrate that FAT achieves impressive performance. (Section 4, Experiments) We conducted experiments on a wide range of vision tasks, including image classification on ImageNet-1K [9], object detection and instance segmentation on COCO 2017 [33], and semantic segmentation on ADE20K [80]. |
| Researcher Affiliation | Academia | Qihang Fan (1,2), Huaibo Huang (1), Xiaoqiang Zhou (1,3), Ran He (1,2); (1) MAIS & CRIPAC, Institute of Automation, Chinese Academy of Sciences, Beijing, China; (2) School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, China; (3) University of Science and Technology of China, Hefei, China |
| Pseudocode | No | The paper provides mathematical equations and architectural diagrams to illustrate its components, but it does not include any sections explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured, code-like steps for its methods. |
| Open Source Code | No | The paper does not include an explicit statement about releasing source code, nor does it provide a link to any code repository for the described methodology. |
| Open Datasets | Yes | We conducted experiments on a wide range of vision tasks, including image classification on ImageNet-1K [9], object detection and instance segmentation on COCO 2017 [33], and semantic segmentation on ADE20K [80]. |
| Dataset Splits | Yes | We train our models on ImageNet-1K [9] from scratch. We conducted experiments on the COCO 2017 dataset [33]. |
| Hardware Specification | Yes | The CPU is an Intel Core i9 and the GPU is a V100. |
| Software Dependencies | No | The paper mentions software tools such as 'MMDetection', 'MMSegmentation', 'AdamW', 'RandAugment', 'Mixup', 'CutMix', and 'Random Erasing', but it does not provide specific version numbers for these components (see the augmentation sketch after this table). |
| Experiment Setup | Yes | All models are trained for 300 epochs from scratch. To train the models, we use the AdamW optimizer with a cosine decay learning rate scheduler and a 20-epoch linear warm-up. We set the initial learning rate, weight decay, and batch size to 0.001, 0.05, and 1024, respectively. (A hedged training-schedule sketch follows this table.) |
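
The experiment-setup row quotes an AdamW plus cosine-decay recipe. As a rough illustration only, the PyTorch sketch below wires those reported numbers (300 epochs, 20-epoch linear warm-up, base learning rate 0.001, weight decay 0.05, batch size 1024) into an optimizer and per-epoch scheduler. The placeholder model and the stepping granularity are assumptions, since the authors' training code is not released.

```python
# Hedged sketch of the ImageNet-1K training schedule quoted above.
# The model is a placeholder; only the optimizer/scheduler settings
# (AdamW, cosine decay, 20-epoch linear warm-up, lr 0.001, wd 0.05,
# batch size 1024, 300 epochs) come from the paper's description.
import math

import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

EPOCHS, WARMUP_EPOCHS = 300, 20
BASE_LR, WEIGHT_DECAY, BATCH_SIZE = 1e-3, 0.05, 1024

model = torch.nn.Linear(3 * 224 * 224, 1000)  # stand-in for the FAT model
optimizer = AdamW(model.parameters(), lr=BASE_LR, weight_decay=WEIGHT_DECAY)

def lr_lambda(epoch: int) -> float:
    """Linear warm-up for 20 epochs, then cosine decay to zero at epoch 300."""
    if epoch < WARMUP_EPOCHS:
        return (epoch + 1) / WARMUP_EPOCHS
    progress = (epoch - WARMUP_EPOCHS) / (EPOCHS - WARMUP_EPOCHS)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda)

for epoch in range(EPOCHS):
    # ... one pass over ImageNet-1K with an effective batch size of 1024 ...
    scheduler.step()
```

Whether the original code steps the learning rate per epoch or per iteration is not stated; the per-epoch variant above is the simpler assumption.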
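The software-dependencies row lists RandAugment, Mixup, CutMix, and Random Erasing without versions or hyperparameters. A minimal sketch of such an augmentation pipeline, assuming timm's standard helpers and typical DeiT-style magnitudes (none of which are stated in the quoted excerpt), might look like:

```python
# Hypothetical augmentation recipe using timm; all magnitudes/probabilities
# below are assumptions based on common DeiT-style recipes, not values
# reported in the paper.
from timm.data import Mixup, create_transform

train_transform = create_transform(
    input_size=224,
    is_training=True,
    auto_augment="rand-m9-mstd0.5-inc1",  # RandAugment policy (assumed)
    re_prob=0.25,                         # Random Erasing probability (assumed)
)

mixup_fn = Mixup(
    mixup_alpha=0.8,      # Mixup strength (assumed)
    cutmix_alpha=1.0,     # CutMix strength (assumed)
    label_smoothing=0.1,  # common default, not stated in the excerpt
    num_classes=1000,
)
```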