Learning by Directional Gradient Descent
Authors: David Silver, Anirudh Goyal, Ivo Danihelka, Matteo Hessel, Hado van Hasselt
ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experiments): We now report and discuss the results of an empirical study that analyses the performance of the proposed estimator using different tasks, as well as using different ways to approximate the expected gradient. We use JAX (Bradbury et al., 2018) to implement all experiments. |
| Researcher Affiliation | Collaboration | 1 DeepMind, London, UK; 2 University College London; 3 Mila, University of Montreal. |
| Pseudocode | Yes | Listing 1: DODGE implemented in JAX. (A hedged re-implementation sketch of the listing's idea is given after the table.) |
| Open Source Code | No | The paper provides an example implementation in Listing 1 and states, "We use JAX (Bradbury et al., 2018) to implement all experiments." However, it does not explicitly provide a link to the authors' full source code for the methodology or state that their code is being released. |
| Open Datasets | Yes | We evaluate the proposed DODGE update on different problems. We first give a brief description of the different problems... Copy task. The copy problem defined in Graves et al. (2014)... MNIST classification task. It is a database of handwritten digits (LeCun, 1998)... Influence Balancing task. This task was introduced by Tallec & Ollivier (2017)... Image regression NeRF task. This task trains the initial parameters of a 2D-NeRF model (Mildenhall et al., 2020)... We build upon the experimental setup proposed by Tancik et al. (2021). |
| Dataset Splits | No | The paper states, "For each method, we choose the best learning rate from {0.003, 0.001, 0.0003, 0.0001, 0.00003, 0.00001}, based on the final performance." This implies some form of validation for hyperparameter tuning, but it does not specify explicit dataset splits (e.g., percentages or counts) for training, validation, or testing for any of the datasets used. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory configurations. It only mentions using JAX for implementation. |
| Software Dependencies | No | The paper mentions using "JAX (Bradbury et al., 2018)", the "Adam optimizer (Kingma & Ba, 2014)", and an "LSTM network (Hochreiter & Schmidhuber, 1997)". However, it does not specify version numbers for these libraries and components, which would be needed to reproduce the software environment. |
| Experiment Setup | Yes | On sequence modeling tasks, we use an LSTM network (Hochreiter & Schmidhuber, 1997) with 128 units and a batch size of 32. We optimize the log-likelihood using the Adam optimizer (Kingma & Ba, 2014). For each method, we choose the best learning rate from {0.003, 0.001, 0.0003, 0.0001, 0.00003, 0.00001}, based on the final performance. We repeat each experiment 5 times with 5 different random seeds. (A sketch of this sweep also follows the table.) |
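
The paper's Listing 1 is not reproduced here, but the following is a minimal sketch of the directional gradient estimator it describes, written against JAX's public API. The function names (`dodge_update`, `loss_fn`), the standard-normal choice of direction, and the plain SGD step are illustrative assumptions rather than the authors' exact code.

```python
# Minimal sketch of a directional (forward-mode) gradient update in JAX.
# Assumptions (not from the paper's listing): `loss_fn` maps a parameter
# pytree to a scalar loss, directions are standard normal, and the step is
# plain SGD rather than the Adam optimizer used in the experiments.
import jax


def dodge_update(loss_fn, params, key, learning_rate=1e-3):
    # Sample a random direction with the same pytree structure as the parameters.
    leaves, treedef = jax.tree_util.tree_flatten(params)
    keys = jax.random.split(key, len(leaves))
    direction = jax.tree_util.tree_unflatten(
        treedef,
        [jax.random.normal(k, leaf.shape) for k, leaf in zip(keys, leaves)])

    # Forward-mode pass: loss value and its directional derivative along `direction`.
    loss, dir_deriv = jax.jvp(loss_fn, (params,), (direction,))

    # Gradient estimate: the sampled direction scaled by the directional derivative.
    grad_estimate = jax.tree_util.tree_map(lambda v: dir_deriv * v, direction)

    # Descend on the estimate.
    new_params = jax.tree_util.tree_map(
        lambda p, g: p - learning_rate * g, params, grad_estimate)
    return new_params, loss
```

For standard-normal directions the estimate is unbiased (E[(∇f·v)v] = ∇f), and it requires only a single forward-mode pass, which is the property the directional update relies on.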
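
The learning-rate sweep quoted in the Experiment Setup row can be expressed compactly. The sketch below assumes the `optax` library for Adam and a hypothetical `run_experiment(optimizer, seed)` trainer; neither is specified in the paper.

```python
# Sketch of the reported hyperparameter sweep: Adam over a fixed learning-rate
# grid, five seeds per setting. `optax` and `run_experiment` are assumptions.
import optax

LEARNING_RATES = (3e-3, 1e-3, 3e-4, 1e-4, 3e-5, 1e-5)
NUM_SEEDS = 5


def sweep(run_experiment):
    """`run_experiment(optimizer, seed)` is a hypothetical trainer that
    returns the final performance of a single run (higher is better here)."""
    results = {}
    for lr in LEARNING_RATES:
        optimizer = optax.adam(lr)  # Adam optimizer (Kingma & Ba, 2014)
        results[lr] = [run_experiment(optimizer, seed) for seed in range(NUM_SEEDS)]
    # Pick the best learning rate by mean final performance.
    best_lr = max(results, key=lambda lr: sum(results[lr]) / NUM_SEEDS)
    return best_lr, results
```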