GhostNetV2: Enhance Cheap Operation with Long-Range Attention

Authors: Yehui Tang, Kai Han, Jianyuan Guo, Chang Xu, Chao Xu, Yunhe Wang

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate the superiority of GhostNetV2 over existing architectures. For example, it achieves 75.3% top-1 accuracy on ImageNet with 167M FLOPs, significantly surpassing GhostNetV1 (74.5%) at a similar computational cost. and In this section, we empirically investigate the proposed GhostNetV2 model. We conduct experiments on the image classification task with the large-scale ImageNet dataset [5]. To validate its generalization, we use GhostNetV2 as a backbone and embed it into a light-weight object detection scheme, YOLOv3 [26]. Models with different backbones are compared on the MS COCO dataset [20]. Finally, we conduct extensive ablation experiments to better understand GhostNetV2.
Researcher Affiliation Collaboration Yehui Tang1,2, Kai Han2, Jianyuan Guo2,3, Chang Xu3, Chao Xu1, Yunhe Wang2 1School of Artificial Intelligence, Peking University 2Huawei Noah's Ark Lab 3School of Computer Science, University of Sydney
Pseudocode No No structured pseudocode or algorithm blocks are provided in the paper.
Open Source Code Yes The source code will be available at https://github.com/huawei-noah/Efficient-AI-Backbones/tree/master/ghostnetv2_pytorch and https://gitee.com/mindspore/models/tree/master/research/cv/ghostnetv2.
Open Datasets Yes The classification experiments are conducted on the benchmark ImageNet (ILSVRC 2012) dataset, which contains 1.28M training images and 50K validation images from 1000 classes. We follow the training setting in [8] and report results with a single crop on the ImageNet dataset. and The experiments are conducted on the MS COCO 2017 dataset, comprising 118k training images and 5k validation images. and We conduct semantic segmentation experiments on ADE20K [43]
Dataset Splits Yes The classification experiments are conducted on the benchmark ImageNet (ILSVRC 2012) dataset, which contains 1.28M training images and 50K validation images from 1000 classes. and The experiments are conducted on the MS COCO 2017 dataset, comprising 118k training images and 5k validation images.
Hardware Specification Yes For an intuitive understanding, we equip the GhostNet model with the self-attention used in MobileViT [23] and measure the latency on a Huawei P30 (Kirin 980 CPU) with the TFLite tool. and The practical latency is measured on a Huawei P30 (Kirin 980 CPU) with the TFLite tool.
Software Dependencies No The paper mentions software like PyTorch [25], MindSpore [15], and the TFLite tool [4], but it does not specify their version numbers.
Experiment Setup Yes We follow the training setting in [8] and report results with a single crop on the ImageNet dataset. and Specifically, based on the pre-trained weights on ImageNet, the models are fine-tuned with the SGD optimizer for 30 epochs. The batch size is set to 192 and the initial learning rate to 0.003. The experiments are conducted with input resolution 320×320. and From the pre-trained weights on ImageNet, the models are fine-tuned for 160000 iterations with crop size 512×512.
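The fine-tuning hyperparameters quoted above can be collected into a small configuration sketch. The dictionary layout and field names below are illustrative assumptions for this review, not the authors' actual training script; only the numeric values come from the paper's text.

```python
# Hedged sketch of the reported fine-tuning settings for GhostNetV2.
# Values are taken verbatim from the paper's experiment-setup text;
# the structure, key names, and helper are assumptions made here.
DETECTION_FINETUNE = {
    "optimizer": "SGD",          # fine-tuned with SGD from ImageNet weights
    "epochs": 30,
    "batch_size": 192,
    "initial_lr": 0.003,
    "input_resolution": (320, 320),
}

SEGMENTATION_FINETUNE = {
    "iterations": 160_000,       # ADE20K fine-tuning length
    "crop_size": (512, 512),
}

def optimizer_kwargs(cfg):
    """Map the config onto keyword arguments for an SGD-style optimizer
    constructor. Momentum and weight decay are unspecified in the quoted
    text, so they are deliberately omitted rather than guessed."""
    return {"lr": cfg["initial_lr"]}
```

Keeping the settings in plain dictionaries makes it easy to diff a reproduction attempt against the paper's reported values before any framework code runs.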