Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
All you need is a good init
Authors: Dmytro Mishkin, Jiri Matas
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Performance is evaluated on GoogLeNet, CaffeNet, FitNets and Residual nets and the state-of-the-art, or very close to it, is achieved on the MNIST, CIFAR-10/100 and ImageNet datasets. |
| Researcher Affiliation | Academia | Dmytro Mishkin, Jiri Matas Center for Machine Perception Czech Technical University in Prague Czech Republic EMAIL |
| Pseudocode | Yes | Algorithm 1: Layer-sequential unit-variance orthogonal initialization. L: convolution or full-connected layer, W_L: its weights, B_L: its output blob, Tol_var: variance tolerance, T_i: current trial, T_max: max number of trials. |
| Open Source Code | Yes | The code allowing to reproduce the experiments is available at https://github.com/ducha-aiki/LSUVinit |
| Open Datasets | Yes | Performance is evaluated on GoogLeNet, CaffeNet, FitNets and Residual nets and the state-of-the-art, or very close to it, is achieved on the MNIST, CIFAR-10/100 and ImageNet datasets. |
| Dataset Splits | No | The paper mentions 'validation accuracy' (Figures 4 and 5) and uses standard datasets like MNIST (60,000 images) and CIFAR-10/100 (60,000 images), which typically have predefined splits. However, it does not explicitly state the specific train/validation/test split percentages, absolute counts, or reference the use of predefined splits with citations for reproduction. |
| Hardware Specification | No | The paper discusses computational time and overhead (Table 6, Section 5.4 'TIMINGS') but does not specify any particular hardware components (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions software systems like 'CaffeNet' (Jia et al. (2014)) but does not provide specific version numbers for any software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | The FitNets are trained with the stochastic gradient descent with momentum set to 0.9, the initial learning rate set to 0.01 and reduced by a factor of 10 after the 100th, 150th and 200th epoch, finishing at 230th epoch. |
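
The step schedule quoted in the Experiment Setup row can be sketched in a few lines of Python. This is an illustration only, not the authors' training code: the function name `fitnet_learning_rate`, the constant names, the 1-indexed epoch convention, and the choice to apply each drop starting with the epoch after the milestone are all assumptions made here. Only the numeric values (momentum 0.9, initial rate 0.01, drops after epochs 100, 150 and 200, 230 epochs in total) come from the quoted text.

```python
# Values taken from the quoted Experiment Setup text.
MOMENTUM = 0.9
BASE_LR = 0.01
MILESTONES = (100, 150, 200)  # the rate drops after these epochs
TOTAL_EPOCHS = 230


def fitnet_learning_rate(epoch, base_lr=BASE_LR, milestones=MILESTONES):
    """Return the learning rate for a 1-indexed epoch (hypothetical helper).

    Assumption: a drop "after the 100th epoch" first applies to epoch 101.
    """
    if not 1 <= epoch <= TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in 1..{TOTAL_EPOCHS}")
    # Count how many milestones have already been passed.
    drops = sum(1 for m in milestones if epoch > m)
    # Each passed milestone divides the rate by 10.
    return base_lr / (10 ** drops)


if __name__ == "__main__":
    # Print the rate on either side of each milestone and at the final epoch.
    for epoch in (1, 100, 101, 150, 151, 200, 201, TOTAL_EPOCHS):
        print(epoch, fitnet_learning_rate(epoch))
```

Under these assumptions the rate stays at the initial value through epoch 100 and has been divided by 1000 for the last 30 epochs. Most frameworks offer an equivalent built-in multi-step schedule, so a helper like this is mainly useful for checking a configuration against the description.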