Split to Be Slim: An Overlooked Redundancy in Vanilla Convolution

Authors: Qiulin Zhang, Zhuqing Jiang, Qishuo Lu, Jia'nan Han, Zhengxin Zeng, Shanghua Gao, Aidong Men

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To show the effectiveness of the proposed SPConv, in this section, we conduct experiments with only the widely-used 3×3 kernels being replaced by our SPConv modules.
Researcher Affiliation | Academia | Qiulin Zhang¹, Zhuqing Jiang¹, Qishuo Lu¹, Jia'nan Han¹, Zhengxin Zeng¹, Shang-Hua Gao², Aidong Men¹. ¹Beijing University of Posts and Telecommunications, ²Nankai University. {qiulinzhang, jiangzhuqing, hanjianan, zengzhengxinsice, menad}@bupt.edu.cn, shgao@mail.nankai.edu.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (e.g., clearly labeled algorithm sections or code-like formatted procedures).
Open Source Code | No | The paper does not provide concrete access to its source code, nor does it explicitly state that the code for its methodology is released or available.
Open Datasets | Yes | Firstly, we perform small scale image classification experiments on the CIFAR-10 dataset [Krizhevsky et al., 2009] with ResNet-20 [He et al., 2016] and VGG-16 [Simonyan and Zisserman, 2015] architectures. Then we experiment a large scale 1000-class single label classification task on ImageNet-2012 [Deng et al., 2009] with ResNet-50 [He et al., 2016] architecture. To explore SPConv's generality further, we also conduct a multi-label object detection experiment on MS COCO dataset [Lin et al., 2014b].
Dataset Splits | Yes | For fair comparisons, all models in each experiment, including reimplemented baselines and SPConv-equipped models, are trained from scratch on 4 NVIDIA Tesla V100 GPUs with the default data augmentation and training strategy which are optimized for vanilla convolution and no other tricks are used. Therefore, our proposed SPConv may achieve better performance with extensive hyper-parameter searches. More ablation studies are performed on small scale CIFAR-10 dataset. ... models are trained on the COCO trainval35k set and tested on the left 5K minival set.
Hardware Specification | Yes | For fair comparisons, all models in each experiment, including reimplemented baselines and SPConv-equipped models, are trained from scratch on 4 NVIDIA Tesla V100 GPUs with the default data augmentation and training strategy which are optimized for vanilla convolution and no other tricks are used. ... Inference time is tested on a single NVIDIA Tesla V100 with NVIDIA DALI as data pipelines.
Software Dependencies | No | The paper mentions "NVIDIA DALI project", "apex [Micikevicius et al., 2018]", and "mmdetection [Chen et al., 2019a]", but does not provide specific version numbers for any of these software components.
Experiment Setup | Yes | Optimization is performed using SGD with weight decay = 5e-4, batch-size = 128, initial learning rate = 0.1 which is divided by 10 every 50 epochs. ... With the default settings, the learning rate starts at 0.1 and decays by a factor of 10 every 30 epochs, using synchronous SGD with weight decay 1e-4, momentum 0.9 and a mini-batch of 256 to train the model from scratch for 90 epochs.
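
The Research Type row above notes that only the 3×3 kernels are swapped for the authors' SPConv modules. Since this page does not reproduce the SPConv implementation itself, the snippet below is only a minimal PyTorch sketch of how such a drop-in swap could be wired; `make_module` stands in for any SPConv-style constructor and is an assumed placeholder, not the authors' code.

```python
from torch import nn


def replace_3x3_convs(model: nn.Module, make_module) -> nn.Module:
    """Recursively swap every 3x3 nn.Conv2d for a same-shaped drop-in module.

    `make_module(in_ch, out_ch, stride=..., padding=..., bias=...)` is a
    placeholder for an SPConv-style constructor; it is assumed here, not
    taken from the paper.
    """
    for name, child in model.named_children():
        if isinstance(child, nn.Conv2d) and child.kernel_size == (3, 3):
            setattr(model, name, make_module(
                child.in_channels, child.out_channels,
                stride=child.stride, padding=child.padding,
                bias=child.bias is not None))
        else:
            replace_3x3_convs(child, make_module)
    return model
```

For example, `replace_3x3_convs(resnet50(), SPConv3x3)` would rebuild a ResNet-50 around a hypothetical `SPConv3x3` class, leaving the 1×1 and 7×7 convolutions untouched, which matches the quoted evaluation protocol.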
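
The Experiment Setup row quotes two schedules: CIFAR-10 (SGD, weight decay 5e-4, batch size 128, lr 0.1 divided by 10 every 50 epochs) and ImageNet-2012 (lr 0.1 decayed by 10× every 30 epochs, synchronous SGD, weight decay 1e-4, momentum 0.9, batch 256, 90 epochs from scratch). As a minimal sketch, assuming a standard single-GPU PyTorch loop (the paper trains on 4 V100s), the ImageNet schedule could be set up as below; `train_loader` is an assumed DataLoader, not specified in the quoted text.

```python
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR
from torchvision.models import resnet50


def train_imagenet(model: nn.Module, train_loader, epochs: int = 90,
                   device: str = "cuda") -> nn.Module:
    """Train from scratch with the ImageNet-2012 schedule quoted above."""
    model = model.to(device)
    criterion = nn.CrossEntropyLoss()
    # SGD: initial lr 0.1, momentum 0.9, weight decay 1e-4.
    optimizer = optim.SGD(model.parameters(), lr=0.1,
                          momentum=0.9, weight_decay=1e-4)
    # Learning rate decays by a factor of 10 every 30 epochs.
    scheduler = StepLR(optimizer, step_size=30, gamma=0.1)

    for _ in range(epochs):
        for images, targets in train_loader:  # mini-batch of 256 in the paper
            images, targets = images.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model


# Usage sketch (DataLoader construction omitted):
# model = train_imagenet(resnet50(num_classes=1000), train_loader)
```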