Deep Convolutional Neural Networks with Merge-and-Run Mappings

Authors: Liming Zhao, Mingjie Li, Depu Meng, Xi Li, Zhaoxiang Zhang, Yueting Zhuang, Zhuowen Tu, Jingdong Wang

IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the performance on the standard recognition tasks. Our approach demonstrates consistent improvements over ResNets with the comparable setup, and achieves competitive results (e.g., 3.06% testing error on CIFAR-10, 17.55% on CIFAR-100, 1.51% on SVHN). ... 4 Experiments ... 4.3 Empirical Study ... 4.4 Comparison with State-of-the-Arts
Researcher Affiliation | Collaboration | Liming Zhao1, Mingjie Li2, Depu Meng2, Xi Li1, Zhaoxiang Zhang3, Yueting Zhuang1, Zhuowen Tu4, Jingdong Wang5; 1 Zhejiang University, 2 University of Science and Technology of China, 3 Institute of Automation, Chinese Academy of Sciences, 4 UC San Diego, 5 Microsoft Research
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. (A hedged sketch of the merge-and-run block is given after this table.)
Open Source Code | Yes | https://github.com/zlmzju/fusenet
Open Datasets | Yes | CIFAR-10 and CIFAR-100. The two datasets are drawn from the 80-million tiny image database [Krizhevsky, 2009]. ... SVHN (street view house numbers) dataset... We also compare our DMRNet-50 against the ResNet-98 with the same experimental settings on the ImageNet 2012 classification dataset [Deng et al., 2009].
Dataset Splits | No | The paper mentions 50000 training images and 10000 test images for CIFAR-10/100, but does not explicitly provide a separate validation dataset split or its size.
Hardware Specification | No | We use SGD with the Nesterov momentum to train all the models for 400 epochs on CIFAR-10/CIFAR-100 and 40 epochs on SVHN, both with a total mini-batch size 64 on two GPUs.
Software Dependencies | No | Our implementation is based on MXNet [Chen et al., 2015].
Experiment Setup | Yes | We use SGD with the Nesterov momentum to train all the models for 400 epochs on CIFAR-10/CIFAR-100 and 40 epochs on SVHN, both with a total mini-batch size 64 on two GPUs. The learning rate starts with 0.1 and is reduced by a factor 10 at the 1/2, 3/4 and 7/8 fractions of the number of training epochs. Similar to [He et al., 2016a], the weight decay is 0.0001, the momentum is 0.9, and the weights are initialized as in [He et al., 2015].
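The Experiment Setup row fully specifies the optimization schedule, so it can be read off directly. Below is a minimal, hedged Python sketch of that learning-rate schedule (start at 0.1, divide by 10 at the 1/2, 3/4 and 7/8 fractions of training); it illustrates the reported setup rather than reproducing the authors' MXNet code, and the helper name `learning_rate` is hypothetical.

```python
# Illustrative reading of the reported training schedule (not the authors' code).
# Values are taken from the quoted Experiment Setup row; names are hypothetical.

def learning_rate(epoch, total_epochs=400, base_lr=0.1, factor=0.1):
    """Learning rate at a given epoch: start at 0.1 and divide by 10
    at the 1/2, 3/4 and 7/8 fractions of the total number of epochs."""
    milestones = [int(total_epochs * f) for f in (0.5, 0.75, 0.875)]
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= factor
    return lr

# Remaining reported hyper-parameters (SGD with Nesterov momentum):
MOMENTUM = 0.9
WEIGHT_DECAY = 1e-4
BATCH_SIZE = 64  # total mini-batch size, split across two GPUs

if __name__ == "__main__":
    # For the 400-epoch CIFAR schedule the rate steps 0.1 -> 0.01 -> 0.001 -> 0.0001.
    for e in (0, 199, 200, 299, 300, 349, 350):
        print(e, f"{learning_rate(e):g}")
```

The same function covers the 40-epoch SVHN run by passing `total_epochs=40`.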
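As noted in the Pseudocode row, the paper gives no algorithm block. For orientation only, here is a minimal sketch of what a single merge-and-run block computes, assuming the formulation described in the paper: the inputs of the two parallel residual branches are averaged, and that average is added to the output of each branch. The function name `merge_and_run_block` and the identity branches are hypothetical placeholders; in the actual networks each branch is a stack of convolutional layers.

```python
import numpy as np

def merge_and_run_block(x1, x2, branch1, branch2):
    """Sketch of one merge-and-run block: average the inputs of the two
    parallel residual branches and add that average to each branch output."""
    merged = 0.5 * (x1 + x2)      # merge: average of the two branch inputs
    y1 = branch1(x1) + merged     # run: each branch output plus the merged input
    y2 = branch2(x2) + merged
    return y1, y2

if __name__ == "__main__":
    # Toy usage with identity "branches" on random feature maps (N, C, H, W).
    x1 = np.random.randn(1, 16, 8, 8)
    x2 = np.random.randn(1, 16, 8, 8)
    y1, y2 = merge_and_run_block(x1, x2, branch1=lambda x: x, branch2=lambda x: x)
    print(y1.shape, y2.shape)
```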