Split to Be Slim: An Overlooked Redundancy in Vanilla Convolution
Authors: Qiulin Zhang, Zhuqing Jiang, Qishuo Lu, Jia'nan Han, Zhengxin Zeng, Shang-Hua Gao, Aidong Men
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To show the effectiveness of the proposed SPConv, in this section, we conduct experiments with only the widely-used 3×3 kernels being replaced by our SPConv modules. (A hedged PyTorch sketch of such a module follows the table.) |
| Researcher Affiliation | Academia | Qiulin Zhang1, Zhuqing Jiang1, Qishuo Lu1, Jia'nan Han1, Zhengxin Zeng1, Shang-Hua Gao2, Aidong Men1 — 1Beijing University of Posts and Telecommunications, 2Nankai University. {qiulinzhang, jiangzhuqing, hanjianan, zengzhengxinsice, menad}@bupt.edu.cn, shgao@mail.nankai.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not provide concrete access to its source code, nor does it explicitly state that the code for its methodology is released or available. |
| Open Datasets | Yes | Firstly, we perform small scale image classification experiments on the CIFAR-10 dataset [Krizhevsky et al., 2009] with ResNet-20 [He et al., 2016] and VGG-16 [Simonyan and Zisserman, 2015] architectures. Then we experiment on a large scale 1000-class single label classification task on ImageNet-2012 [Deng et al., 2009] with the ResNet-50 [He et al., 2016] architecture. To explore SPConv's generality further, we also conduct a multi-label object detection experiment on the MS COCO dataset [Lin et al., 2014b]. |
| Dataset Splits | Yes | For fair comparisons, all models in each experiment, including reimplemented baselines and SPConv-equipped models, are trained from scratch on 4 NVIDIA Tesla V100 GPUs with the default data augmentation and training strategy, which are optimized for vanilla convolution; no other tricks are used. Therefore, our proposed SPConv may achieve better performance with extensive hyper-parameter searches. More ablation studies are performed on the small-scale CIFAR-10 dataset. ... models are trained on the COCO trainval35k set and tested on the remaining 5K minival set. |
| Hardware Specification | Yes | For fair comparisons, all models in each experiment, including reimplemented baselines and SPConv-equipped models, are trained from scratch on 4 NVIDIA Tesla V100 GPUs with the default data augmentation and training strategy, which are optimized for vanilla convolution; no other tricks are used. ... Inference time is tested on a single NVIDIA Tesla V100 with NVIDIA DALI as the data pipeline. (A CUDA-event timing sketch follows the table.) |
| Software Dependencies | No | The paper mentions the "NVIDIA DALI project", "apex [Micikevicius et al., 2018]", and "mmdetection [Chen et al., 2019a]", but does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | Optimization is performed using SGD with weight decay = 5e-4, batch size = 128, and an initial learning rate of 0.1 which is divided by 10 every 50 epochs. ... With the default settings, the learning rate starts at 0.1 and decays by a factor of 10 every 30 epochs, using synchronous SGD with weight decay 1e-4, momentum 0.9, and a mini-batch of 256 to train the model from scratch for 90 epochs. (An optimizer/schedule sketch follows the table.) |
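
Since the paper releases neither pseudocode nor source code, the following is a minimal PyTorch sketch of an SPConv-style drop-in for a 3×3 convolution, reconstructed from the paper's description: input channels are split by a ratio α into a representative part (grouped 3×3 conv plus a pointwise 1×1 conv) and a redundant part (cheap 1×1 conv only), and the two branch outputs are fused by a parameter-free GAP-plus-softmax attention. The class name, defaults, and fusion details below are our assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPConv(nn.Module):
    """SPConv-style replacement for a vanilla 3x3 convolution (sketch).

    Channels are split by `alpha` into a representative part, processed
    by a grouped 3x3 conv plus a pointwise 1x1 conv, and a redundant
    part, processed by a 1x1 conv only. The branch outputs are fused
    with a parameter-free GAP + softmax attention.
    """
    def __init__(self, in_ch, out_ch, stride=1, alpha=0.5, groups=2):
        super().__init__()
        self.rep_ch = int(in_ch * alpha)   # representative channels
        self.red_ch = in_ch - self.rep_ch  # redundant channels
        # note: rep_ch and out_ch must be divisible by `groups`
        self.conv3x3 = nn.Conv2d(self.rep_ch, out_ch, 3, stride, 1,
                                 groups=groups, bias=False)
        self.conv1x1_rep = nn.Conv2d(self.rep_ch, out_ch, 1, stride,
                                     bias=False)
        self.conv1x1_red = nn.Conv2d(self.red_ch, out_ch, 1, stride,
                                     bias=False)
        self.gap = nn.AdaptiveAvgPool2d(1)

    def forward(self, x):
        x_rep, x_red = torch.split(x, [self.rep_ch, self.red_ch], dim=1)
        y_rep = self.conv3x3(x_rep) + self.conv1x1_rep(x_rep)
        y_red = self.conv1x1_red(x_red)
        # fuse: softmax over the two branches' per-channel GAP statistics
        w = F.softmax(torch.stack([self.gap(y_rep), self.gap(y_red)]), dim=0)
        return w[0] * y_rep + w[1] * y_red
```

A quick shape check: `SPConv(64, 64)(torch.randn(2, 64, 32, 32))` returns a tensor of shape `(2, 64, 32, 32)`, so the module can stand in for a same-shape 3×3 convolution.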
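
The ImageNet-2012 recipe quoted in the Experiment Setup row maps directly onto a standard PyTorch optimizer and step schedule. A sketch, omitting the data pipeline, 4-GPU distribution, and apex mixed precision:

```python
import torch
from torchvision.models import resnet50

model = resnet50()  # baseline; the paper swaps its 3x3 convs for SPConv

# Quoted ImageNet-2012 recipe: synchronous SGD, lr 0.1 decayed 10x every
# 30 epochs, weight decay 1e-4, momentum 0.9, mini-batch 256, 90 epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

for epoch in range(90):
    # ... one pass over ImageNet in mini-batches of 256 goes here ...
    scheduler.step()
```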
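
The inference-time measurement in the Hardware Specification row can be approximated with CUDA events. A sketch, assuming a CUDA-capable machine, with a random tensor standing in for the NVIDIA DALI pipeline and a bare 3×3 convolution standing in for the full model:

```python
import torch

model = torch.nn.Conv2d(64, 64, 3, padding=1).cuda().eval()  # stand-in model
x = torch.randn(8, 64, 56, 56, device="cuda")  # stand-in for the DALI pipeline

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
with torch.no_grad():
    for _ in range(10):    # warm-up so lazy initialization does not skew timing
        model(x)
    start.record()
    for _ in range(100):
        model(x)
    end.record()
torch.cuda.synchronize()   # wait for the recorded events to complete
print(f"avg latency: {start.elapsed_time(end) / 100:.3f} ms")
```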