Multi-Scale Dense Networks for Resource Efficient Image Classification

Authors: Gao Huang, Danlu Chen, Tianhong Li, Felix Wu, Laurens van der Maaten, Kilian Weinberger

ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on three image-classification tasks demonstrate that our framework substantially improves the existing state-of-the-art in both settings.
Researcher Affiliation | Collaboration | Gao Huang (Cornell University), Danlu Chen (Fudan University), Tianhong Li (Tsinghua University), Felix Wu (Cornell University), Laurens van der Maaten (Facebook AI Research), Kilian Weinberger (Cornell University)
Pseudocode | No | The paper does not contain any clearly labeled "Pseudocode" or "Algorithm" blocks. Figures depict network architectures, but not procedural steps in pseudocode form.
Open Source Code | Yes | Code to reproduce all results is available at https://anonymous-url.
Open Datasets | Yes | We evaluate the effectiveness of our approach on three image classification datasets, i.e., the CIFAR-10, CIFAR-100 (Krizhevsky & Hinton, 2009) and ILSVRC 2012 (ImageNet; Deng et al., 2009) datasets.
Dataset Splits | Yes | The two CIFAR datasets contain 50,000 training and 10,000 test images of 32x32 pixels; we hold out 5,000 training images as a validation set. The ImageNet dataset comprises 1,000 classes, with a total of 1.2 million training images and 50,000 validation images. We hold out 50,000 images from the training set to estimate the confidence threshold for classifiers in MSDNet.
Hardware Specification | No | The paper discusses computational resources and CPU time but does not specify the hardware used to run the experiments, such as GPU models, CPU models, or memory; it only mentions general terms like "CPU time".
Software Dependencies | No | The paper mentions training models with "the framework of Gross & Wilber (2016)", which is Torch, but does not provide version numbers for this framework or for any other software dependencies such as programming languages or libraries (e.g., Python version, PyTorch/TensorFlow version).
Experiment Setup | Yes | On the two CIFAR datasets, all models (including all baselines) are trained using stochastic gradient descent (SGD) with mini-batch size 64. We use Nesterov momentum with a momentum weight of 0.9 without dampening, and a weight decay of 10^-4. All models are trained for 300 epochs, with an initial learning rate of 0.1, which is divided by a factor of 10 after 150 and 225 epochs. We apply the same optimization scheme to the ImageNet dataset, except that we increase the mini-batch size to 256, and all the models are trained for 90 epochs with learning rate drops after 30 and 60 epochs.
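For concreteness, below is a minimal PyTorch sketch of the CIFAR optimization scheme quoted in the Experiment Setup row (SGD with Nesterov momentum 0.9 and no dampening, weight decay 10^-4, batch size 64, 300 epochs, learning rate 0.1 divided by 10 after epochs 150 and 225, and the 5,000-image validation hold-out from the Dataset Splits row). This is an assumption-laden illustration, not the authors' implementation: the released code uses Torch rather than PyTorch, and the stand-in model here is a placeholder, not the MSDNet architecture. The ImageNet variant would use batch size 256, 90 epochs, and learning-rate drops after epochs 30 and 60.

```python
# Sketch of the reported CIFAR training setup; the model is a stand-in, not MSDNet.
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

transform = transforms.Compose([transforms.ToTensor()])
full_train = datasets.CIFAR10("data", train=True, download=True, transform=transform)
# Hold out 5,000 of the 50,000 training images as a validation set.
train_set, val_set = random_split(full_train, [45_000, 5_000])

train_loader = DataLoader(train_set, batch_size=64, shuffle=True)
val_loader = DataLoader(val_set, batch_size=64)

# Placeholder model; the actual experiments train MSDNet and the baselines.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,            # initial learning rate
    momentum=0.9,      # Nesterov momentum, no dampening
    nesterov=True,
    dampening=0,
    weight_decay=1e-4,
)
# Divide the learning rate by 10 after epochs 150 and 225.
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[150, 225], gamma=0.1)

for epoch in range(300):
    model.train()
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # lr: 0.1 -> 0.01 after epoch 150 -> 0.001 after epoch 225
```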