Information-Directed Exploration for Deep Reinforcement Learning

Authors: Nikolay Nikolov, Johannes Kirschner, Felix Berkenkamp, Andreas Krause

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on Atari games and demonstrate a significant improvement over alternative approaches.
Researcher Affiliation | Academia | Nikolay Nikolov (Imperial College London, ETH Zurich), nikolay.nikolov14@imperial.ac.uk; Johannes Kirschner, Felix Berkenkamp, Andreas Krause (ETH Zurich), {jkirschner, befelix}@inf.ethz.ch, krausea@ethz.ch
Pseudocode | Yes | Algorithm 1: Deterministic Information-Directed Q-learning (a hedged sketch of the action-selection rule appears after this table).
Open Source Code | Yes | Our code can be found at https://github.com/nikonikolov/rltf/tree/ids-drl.
Open Datasets | Yes | We now provide experimental results on 55 of the Atari 2600 games from the Arcade Learning Environment (ALE) (Bellemare et al., 2013), simulated via the OpenAI Gym interface (Brockman et al., 2016).
Dataset Splits | No | Every 1M training frames, learning is frozen, the agent is evaluated for 500K frames, and performance is computed as the average episode return from this latest evaluation run (see the evaluation sketch after this table).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned.
Software Dependencies | No | The paper mentions the Adam optimizer and the OpenAI Gym interface but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | Table 2 (ALE hyperparameters) lists numerous specific parameters, such as λ = 0.1, mini-batch size 32, learning rate α = 0.00005, and target network update frequency 40000 (see the configuration sketch after this table).
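For reference, the Pseudocode row above points to Algorithm 1 (Deterministic Information-Directed Q-learning). Below is a minimal sketch of the deterministic IDS action-selection rule under the assumptions that an ensemble of Q-value estimates supplies the epistemic uncertainty, as in the paper; the function and argument names (`ids_action`, `q_samples`, `lam`, `rho2`) are illustrative and not taken from the released code.

```python
import numpy as np

def ids_action(q_samples, lam=0.1, rho2=1.0, eps=1e-9):
    """Deterministic IDS action selection (sketch, not the released code).

    q_samples: array of shape (ensemble_size, num_actions) holding
        Q-value estimates from a bootstrapped ensemble.
    lam: confidence-interval width (lambda = 0.1 in Table 2).
    rho2: per-action return variance; rho2 = 1.0 corresponds to the
        homoscedastic variant.
    """
    mu = q_samples.mean(axis=0)      # epistemic mean per action
    sigma = q_samples.std(axis=0)    # epistemic std per action

    # Conservative regret estimate: optimistic best arm vs. pessimistic a.
    delta = (mu + lam * sigma).max() - (mu - lam * sigma)

    # Information gain about the optimal action (eps avoids division by 0).
    info_gain = np.log1p(sigma ** 2 / rho2) + eps

    # Pick the action minimizing the squared-regret / information ratio.
    return int(np.argmin(delta ** 2 / info_gain))
```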
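The evaluation protocol quoted in the Dataset Splits row (train for 1M frames, freeze learning, evaluate for 500K frames, report the average episode return) can be read as the loop below. This is a sketch only, assuming the pre-2021 Gym step API that was current in 2019 and a hypothetical `agent.act` method.

```python
import gym
import numpy as np

EVAL_FRAMES = 500_000  # evaluation length after every 1M training frames

def evaluate(agent, env_id="BreakoutNoFrameskip-v4", num_frames=EVAL_FRAMES):
    """Freeze learning and return the average episode return."""
    env = gym.make(env_id)  # one of the 55 ALE games
    returns, episode_return, frames = [], 0.0, 0
    obs = env.reset()
    while frames < num_frames:
        action = agent.act(obs, greedy=True)  # no learning updates here
        obs, reward, done, _ = env.step(action)
        episode_return += reward
        frames += 1
        if done:
            returns.append(episode_return)
            episode_return = 0.0
            obs = env.reset()
    return float(np.mean(returns)) if returns else episode_return
```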
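Finally, the hyperparameters excerpted from Table 2 in the Experiment Setup row can be collected into a configuration dictionary like the one below; the key names are hypothetical, but the values are the ones quoted above.

```python
# Values quoted from Table 2 of the paper; key names are illustrative.
ALE_HYPERPARAMS = {
    "lambda": 0.1,                 # IDS confidence-interval width (lambda)
    "mini_batch_size": 32,
    "learning_rate": 0.00005,      # Adam step size (alpha)
    "target_update_freq": 40_000,  # target network update frequency
}
```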