Dual-stream Network for Visual Recognition

Authors: Mingyuan Mao, Peng Gao, Renrui Zhang, Honghui Zheng, Teli Ma, Yan Peng, Errui Ding, Baochang Zhang, Shumin Han

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Without bells and whistles, the proposed DS-Net outperforms DeiT-Small by 2.4% in terms of top-1 accuracy on ImageNet-1k and achieves state-of-the-art performance over other Vision Transformers and ResNets. For object detection and instance segmentation, DS-Net-Small outperforms ResNet-50 by 6.4% and 5.5%, respectively, in terms of mAP on MSCOCO 2017... 4 Experiments: In this section, we first provide three ablation studies to explore the optimal structure of DS-Net and interpret the necessity of dual-stream design. Then we give the experimental results of image classification and downstream tasks including object detection and instance segmentation.
Researcher Affiliation | Collaboration | Mingyuan Mao1, Peng Gao3, Renrui Zhang2, Honghui Zheng3, Teli Ma2, Yan Peng3, Errui Ding3, Baochang Zhang1*, Shumin Han3*. 1Beihang University, Beijing, China; 2Shanghai AI Laboratory, China; 3Department of Computer Vision Technology (VIS), Baidu Inc
Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The code will be released soon.
Open Datasets | Yes | We use ImageNet-1K [8] for classification and MSCOCO 2017 [25] for object detection and instance segmentation.
Dataset Splits | Yes | ImageNet-1K [8], comprising 1.28M training images and 50K validation images of 1000 classes. ... MSCOCO 2017 [25], containing 118K training images and 5K validation images.
Hardware Specification | Yes | All experiments are conducted on 8 V100 GPUs and the throughput is tested on 1 V100 GPU.
Software Dependencies | No | The paper does not provide specific version numbers for software dependencies (e.g., programming languages, libraries, or frameworks).
Experiment Setup | Yes | We train our model for 300 epochs by AdamW optimizer. The initial learning rate is set to 1e-3 and scheduled by the cosine strategy. ... As the standard 1× schedule (12 epochs), we adopt AdamW optimizer with initial learning rate of 1e-4, decayed by 0.1 at epoch 8 and 11. We set stochastic drop path regularization of 0.1 and weight decay of 0.05.
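Since the authors' code was not released at the time of review, the quoted training configuration can only be sketched. The following is a minimal, hypothetical PyTorch reconstruction based solely on the hyperparameters cited above; the function names and the assignment of weight decay to both setups are assumptions, not the authors' implementation.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR, MultiStepLR

def classification_training_setup(model: torch.nn.Module, epochs: int = 300):
    # ImageNet-1K classification (as quoted): AdamW, initial lr 1e-3,
    # weight decay 0.05, cosine learning-rate schedule over 300 epochs.
    # The 0.1 stochastic drop-path rate is a model-level regularizer and
    # would be set inside the DS-Net definition, not in the optimizer.
    optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler

def detection_finetuning_setup(model: torch.nn.Module):
    # MSCOCO 2017 detection/segmentation, standard 1x schedule (12 epochs):
    # AdamW with initial lr 1e-4, decayed by a factor of 0.1 at epochs 8 and 11.
    optimizer = AdamW(model.parameters(), lr=1e-4, weight_decay=0.05)
    scheduler = MultiStepLR(optimizer, milestones=[8, 11], gamma=0.1)
    return optimizer, scheduler
```

In a training loop built around either setup, scheduler.step() would be called once per epoch so that the learning rate follows the quoted cosine or step decay.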