Network In Network

Authors: Min Lin; Qiang Chen; Shuicheng Yan

ICLR 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrated the state-of-the-art classification performances with NIN on CIFAR-10 and CIFAR-100, and reasonable performances on SVHN and MNIST datasets."
Researcher Affiliation | Academia | Min Lin (1,2), Qiang Chen (2), Shuicheng Yan (2); (1) Graduate School for Integrative Sciences and Engineering, (2) Department of Electronic & Computer Engineering, National University of Singapore, Singapore; {linmin,chenqiang,eleyans}@nus.edu.sg
Pseudocode | No | The paper includes mathematical equations and architectural diagrams (e.g., Figure 1, Figure 2) but does not contain any structured pseudocode or algorithm blocks. A minimal architecture sketch is given after this table.
Open Source Code | No | The paper states: "We implement our network on the super fast cuda-convnet code developed by Alex Krizhevsky [4]." This indicates use of an existing open-source framework, but there is no statement or link indicating that the authors released their own implementation of Network In Network.
Open Datasets | Yes | "We evaluate NIN on four benchmark datasets: CIFAR-10 [12], CIFAR-100 [12], SVHN [13] and MNIST [1]."
Dataset Splits | Yes | "For this dataset, we use the last 10,000 images of the training set as validation data." (CIFAR-10) ... "Namely 400 samples per class selected from the training set and 200 samples per class from the extra set are used for validation." (SVHN) A split-reconstruction sketch is given after this table.
Hardware Specification | No | The paper states "We implement our network on the super fast cuda-convnet code developed by Alex Krizhevsky [4]." While this implies the use of NVIDIA GPUs, no specific hardware (GPU model, CPU, memory) is given for the experimental setup.
Software Dependencies | No | The paper mentions "We implement our network on the super fast cuda-convnet code developed by Alex Krizhevsky [4]," but no version numbers for cuda-convnet or any other software dependency are provided.
Experiment Setup | No | The paper gives some general training procedures, such as "The network is trained using mini-batches of size 128," describes the learning-rate reduction procedure, and mentions dropout and weight decay. However, it explicitly states that "The detailed settings of the parameters are provided in the supplementary materials," so concrete hyperparameter values (e.g., initial learning rate, weight decay) are absent from the main text. A hedged setup sketch is given after this table.
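
As a companion to the Pseudocode entry: the paper describes its mlpconv layer (a convolution whose linear filter is replaced by a micro multi-layer perceptron, realized as 1x1 convolutions) and global average pooling only through equations and diagrams. The following is a minimal PyTorch sketch of that structure; PyTorch itself, the channel counts, kernel sizes, and pooling layout are illustrative assumptions, not the authors' cuda-convnet configuration.

```python
import torch
import torch.nn as nn

class MLPConv(nn.Module):
    """One mlpconv block: a conventional convolution followed by 1x1
    convolutions, which act as a micro MLP sliding over the feature map."""
    def __init__(self, in_ch, mid_ch, out_ch, kernel_size, padding=0):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid_ch, kernel_size, padding=padding),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, mid_ch, kernel_size=1),  # 1x1 conv: cross-channel MLP layer
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class NIN(nn.Module):
    """Three stacked mlpconv blocks; global average pooling replaces the
    fully connected classifier (one feature map per class is averaged)."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            MLPConv(3, 192, 160, kernel_size=5, padding=2),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.Dropout(0.5),                      # dropout between mlpconv blocks
            MLPConv(160, 192, 192, kernel_size=5, padding=2),
            nn.MaxPool2d(3, stride=2, padding=1),
            nn.Dropout(0.5),
            MLPConv(192, 192, num_classes, kernel_size=3, padding=1),
        )
        self.gap = nn.AdaptiveAvgPool2d(1)        # global average pooling

    def forward(self, x):
        x = self.features(x)
        return self.gap(x).flatten(1)             # (N, num_classes) class confidences
```

A forward pass on a CIFAR-sized batch, `NIN()(torch.randn(8, 3, 32, 32))`, returns an 8x10 tensor of class scores.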
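As a companion to the Open Datasets and Dataset Splits entries, here is a sketch of how the quoted validation splits could be rebuilt today. torchvision is an assumption (the original work used cuda-convnet's data pipeline), and because the paper does not say how the SVHN validation samples were chosen, a first-n-per-class rule is used as a placeholder.

```python
import numpy as np
from torchvision import datasets

# CIFAR-10: the paper holds out the last 10,000 training images as validation.
cifar_train = datasets.CIFAR10(root="data", train=True, download=True)
train_idx = np.arange(0, 40_000)
val_idx = np.arange(40_000, 50_000)

# SVHN: 400 samples per class from 'train' and 200 per class from 'extra'
# form the validation set. The selection criterion is not specified in the
# paper, so the first n occurrences of each class are taken here.
def per_class_indices(labels, n_per_class, num_classes=10):
    labels = np.asarray(labels)
    return np.concatenate(
        [np.flatnonzero(labels == c)[:n_per_class] for c in range(num_classes)]
    )

svhn_train = datasets.SVHN(root="data", split="train", download=True)
svhn_extra = datasets.SVHN(root="data", split="extra", download=True)
val_from_train = per_class_indices(svhn_train.labels, 400)
val_from_extra = per_class_indices(svhn_extra.labels, 200)
```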
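As a companion to the Experiment Setup entry: only the mini-batch size (128), the use of dropout and weight decay, and a plateau-based learning-rate adjustment are stated in the main text. The sketch below wires those pieces together in PyTorch; the initial learning rate, momentum, weight-decay value, and patience are placeholders because the paper defers the detailed settings to the supplementary material, and the divide-by-10 plateau rule is an assumed reading of the described adjustment process.

```python
import torch

BATCH_SIZE = 128       # stated in the paper
INIT_LR = 0.1          # placeholder: not given in the main text
MOMENTUM = 0.9         # placeholder: not given in the main text
WEIGHT_DECAY = 1e-4    # placeholder: weight decay is used, but its value is not given

model = NIN(num_classes=10)  # NIN sketch from the architecture example above
optimizer = torch.optim.SGD(model.parameters(), lr=INIT_LR,
                            momentum=MOMENTUM, weight_decay=WEIGHT_DECAY)

# Lower the learning rate by a factor of 10 whenever the monitored accuracy
# stops improving; call scheduler.step(train_accuracy) once per epoch.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.1, patience=3)
```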