Learning the Number of Neurons in Deep Networks
Authors: Jose M. Alvarez, Mathieu Salzmann
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we demonstrate the ability of our method to automatically determine the number of neurons on the task of large-scale classification. To this end, we study three different architectures and analyze the behavior of our method on three different datasets, with a particular focus on parameter reduction. Below, we first describe our experimental setup and then discuss our results. |
| Researcher Affiliation | Academia | Jose M. Alvarez Data61 @ CSIRO Canberra, ACT 2601, Australia jose.alvarez@data61.csiro.au Mathieu Salzmann CVLab, EPFL CH-1015 Lausanne, Switzerland mathieu.salzmann@epfl.ch |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks (clearly labeled algorithm sections or code-like formatted procedures). |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing the code for the work described, nor does it provide a direct link to a source-code repository. |
| Open Datasets | Yes | For our experiments, we used two large-scale image classification datasets, ImageNet [Russakovsky et al., 2015] and Places2-401 [Zhou et al., 2015]. Furthermore, we conducted additional experiments on the character recognition dataset of [Jaderberg et al., 2014a]. |
| Dataset Splits | Yes | We used the ILSVRC-2012 [Russakovsky et al., 2015] subset consisting of 1000 categories, with 1.2 million training images and 50,000 validation images. Finally, the ICDAR character recognition dataset of [Jaderberg et al., 2014a] consists of 185,639 training and 5,198 test samples split into 36 categories. |
| Hardware Specification | Yes | More specifically, for ImageNet and Places2-401, we used the torch-7 multi-gpu framework [Collobert et al., 2011] on a Dual Xeon 8-core E5-2650 with 128GB of RAM using three Kepler Tesla K20m GPUs in parallel. All models were trained for a total of 55 epochs with 12,000 batches per epoch and a batch size of 48 and 180 for BNet and Dec8, respectively. For ICDAR, we trained each network on a single Tesla K20m GPU for a total of 45 epochs with a batch size of 256 and 1,000 iterations per epoch. |
| Software Dependencies | No | The paper mentions using the 'torch-7 multi-gpu framework [Collobert et al., 2011]' but does not provide a specific version number for Torch or any other software libraries or dependencies. |
| Experiment Setup | Yes | All models were trained for a total of 55 epochs with 12,000 batches per epoch and a batch size of 48 and 180 for BNet and Dec8, respectively. The learning rate was set to an initial value of 0.01 and then multiplied by 0.1. Data augmentation was done through random crops and random horizontal flips with probability 0.5. For ICDAR, we trained each network on a single Tesla K20m GPU for a total of 45 epochs with a batch size of 256 and 1,000 iterations per epoch. In this case, the learning rate was set to an initial value of 0.1 and multiplied by 0.1 in the second, seventh and fifteenth epochs. We used a momentum of 0.9. In terms of hyper-parameters, for large-scale classification, we used λl = 0.102 for the first three layers and λl = 0.255 for the remaining ones. For ICDAR, we used λl = 5.1 for the first layer and λl = 10.2 for the remaining ones. (A hedged sketch of this configuration follows the table.) |
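
To make the reported schedule concrete, below is a minimal sketch of the ICDAR training configuration from the Experiment Setup row. It is written in PyTorch purely for illustration (the paper used the Torch-7 framework), the model is a trivial placeholder for the actual architectures (BNet / Dec8), the crop size is not restated in this section, and the `group_sparsity_penalty` helper is a hypothetical illustration of how the per-layer λl weights could enter the loss; it is not the paper's actual regularizer.

```python
# Hedged sketch of the reported training configuration (assumptions noted above).
import torch
import torch.nn as nn
from torchvision import transforms

CROP_SIZE = 32     # placeholder; the section does not restate the input size
NUM_CLASSES = 36   # ICDAR: 36 character categories
BATCH_SIZE = 256   # as reported for ICDAR
EPOCHS = 45        # as reported for ICDAR

# Placeholder model standing in for the networks trained in the paper.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * CROP_SIZE * CROP_SIZE, NUM_CLASSES))

# Reported optimizer settings: SGD with momentum 0.9; initial LR 0.1 for ICDAR
# (0.01 for the large-scale ImageNet / Places2-401 runs), multiplied by 0.1
# in the second, seventh and fifteenth epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[2, 7, 15], gamma=0.1)

# Reported data augmentation: random crops and random horizontal flips with p = 0.5.
augmentation = transforms.Compose([
    transforms.RandomCrop(CROP_SIZE),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])

# Hypothetical helper showing one way the per-layer weights lambda_l
# (0.102 / 0.255 for large-scale classification, 5.1 / 10.2 for ICDAR)
# could scale a group penalty over each neuron's incoming weights.
def group_sparsity_penalty(linear_layers, lambdas):
    penalty = torch.tensor(0.0)
    for layer, lam in zip(linear_layers, lambdas):
        # One group per output neuron: the L2 norm of that neuron's weight row.
        penalty = penalty + lam * layer.weight.norm(p=2, dim=1).sum()
    return penalty
```

In such a sketch, `scheduler.step()` would be called once per epoch and the penalty added to the classification loss before backpropagation; none of this is prescribed by the section beyond the quoted hyper-parameter values.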