Learning Curve Prediction with Bayesian Neural Networks

Authors: Aaron Klein, Stefan Falkner, Jost Tobias Springenberg, Frank Hutter

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 3 EXPERIMENTS
Researcher Affiliation | Academia | Aaron Klein, Stefan Falkner, Jost Tobias Springenberg & Frank Hutter, Department of Computer Science, University of Freiburg, {kleinaa,sfalkner,springj,fh}@cs.uni-freiburg.de
Pseudocode | No | The paper describes methods and processes but does not include a dedicated pseudocode block or a clearly labeled 'Algorithm X' section.
Open Source Code | No | The paper does not contain any statement about making its source code available or provide a link to a code repository.
Open Datasets | Yes | CNN: We sampled 256 configurations of 5 different hyperparameters of a 3-layer convolutional neural network (CNN) and trained each of them for 40 epochs on the CIFAR10 (Krizhevsky, 2009) benchmark. FCNet: We sampled 4096 configurations of 10 hyperparameters of a 2-layer feed forward neural network (FCNet) on MNIST (Le Cun et al., 2001)...
Dataset Splits | Yes | To estimate how well Bayesian neural networks perform in this task, we used the datasets from Section 3.1 and split all of them into 16 folds, allowing us to perform cross-validation of the predictive performance.
Hardware Specification | No | The paper does not specify any particular hardware components (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper refers to various methods and tools (e.g., 'probabilistic back propagation', 'SGLD', 'SGHMC', 'random forests', 'emcee') by citing their originating papers, but it does not specify the version numbers of any software libraries, frameworks, or dependencies used in their implementation.
Experiment Setup | Yes | For both networks, we used a 3-layer architecture with tanh activations and 64 units per layer. We also evaluate two different sampling methods for both types of networks: stochastic gradient Langevin dynamics (SGLD) and stochastic gradient Hamiltonian MCMC (SGHMC), following the approach of Springenberg et al. (2016) to automatically adapt the noise estimate and the preconditioning of the gradients. Table 2: Hyperparameter configuration space of the four different iterative methods. For the FCNet we decayed the learning rate by a factor α_decay = (1 + γt)^(−κ) and also sampled different values for γ and κ.
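
The 'Open Datasets' row above describes benchmarks built by randomly sampling hyperparameter configurations (256 for the CNN on CIFAR10, 4096 for the FCNet on MNIST). The following is a minimal sketch of such sampling; the hyperparameter names and ranges are illustrative assumptions, not the configuration space from the paper's Table 2.

```python
import numpy as np

# Minimal sketch of random hyperparameter sampling as used to build the
# CNN benchmark (256 configurations, 5 hyperparameters). Names and ranges
# are illustrative assumptions, not the paper's actual search space.
rng = np.random.RandomState(0)

def sample_cnn_config():
    return {
        "learning_rate": float(10 ** rng.uniform(-6, -1)),  # log-uniform (assumed)
        "batch_size": int(2 ** rng.randint(4, 10)),          # assumed range
        "n_filters_1": int(2 ** rng.randint(4, 7)),          # assumed range
        "n_filters_2": int(2 ** rng.randint(4, 7)),          # assumed range
        "n_filters_3": int(2 ** rng.randint(4, 7)),          # assumed range
    }

configs = [sample_cnn_config() for _ in range(256)]  # 256 configurations for the CNN benchmark
print(configs[0])
```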
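The 'Dataset Splits' row quotes a 16-fold cross-validation over the sampled configurations. Below is a minimal sketch of such a split using NumPy only; fitting and evaluating the learning curve model is omitted.

```python
import numpy as np

# Minimal sketch of a 16-fold split over sampled configurations, used to
# cross-validate predictive performance; model fitting itself is omitted.
rng = np.random.RandomState(0)
n_configs = 256                          # e.g. the CNN benchmark
folds = np.array_split(rng.permutation(n_configs), 16)

for k, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for i, f in enumerate(folds) if i != k])
    # fit the learning curve model on train_idx, evaluate on test_idx here
    print(f"fold {k:2d}: {len(train_idx)} train / {len(test_idx)} test configurations")
```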
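The 'Experiment Setup' row describes a 3-layer tanh network with 64 units per layer and, for the FCNet benchmark, a learning-rate decay factor α_decay = (1 + γt)^(−κ). The sketch below writes both down in plain NumPy, treating the architecture as three hidden layers of 64 units each; it illustrates the stated architecture and decay form only, not the authors' implementation (which additionally samples the network weights with SGLD/SGHMC).

```python
import numpy as np

def forward(x, weights, biases):
    """Forward pass of a 3-hidden-layer tanh network with a scalar output head."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.tanh(h @ W + b)
    return h @ weights[-1] + biases[-1]

def lr_decay(t, gamma, kappa):
    """FCNet learning-rate decay factor, assuming the form (1 + gamma*t)**(-kappa)."""
    return (1.0 + gamma * t) ** (-kappa)

# Example dimensions: a few input features, 64 units in each of the 3 hidden layers.
rng = np.random.RandomState(0)
dims = [5, 64, 64, 64, 1]
weights = [rng.randn(d_in, d_out) * 0.1 for d_in, d_out in zip(dims[:-1], dims[1:])]
biases = [np.zeros(d_out) for d_out in dims[1:]]

print(forward(rng.randn(4, dims[0]), weights, biases).shape)  # -> (4, 1)
print(lr_decay(t=10, gamma=0.1, kappa=0.5))
```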