Information-Directed Exploration for Deep Reinforcement Learning

Authors: Nikolay Nikolov, Johannes Kirschner, Felix Berkenkamp, Andreas Krause

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our method on Atari games and demonstrate a significant improvement over alternative approaches.
Researcher Affiliation | Academia | Nikolay Nikolov (Imperial College London, ETH Zurich), nikolay.nikolov14@imperial.ac.uk; Johannes Kirschner, Felix Berkenkamp, Andreas Krause (ETH Zurich), {jkirschner, befelix}@inf.ethz.ch, krausea@ethz.ch
Pseudocode | Yes | Algorithm 1: Deterministic Information-Directed Q-learning (a hedged sketch of the action-selection rule appears after this table).
Open Source Code | Yes | Our code can be found at https://github.com/nikonikolov/rltf/tree/ids-drl.
Open Datasets | Yes | We now provide experimental results on 55 of the Atari 2600 games from the Arcade Learning Environment (ALE) (Bellemare et al., 2013), simulated via the OpenAI Gym interface (Brockman et al., 2016).
Dataset Splits | No | Every 1M training frames, learning is frozen, the agent is evaluated for 500K frames, and performance is computed as the average episode return from this latest evaluation run (see the evaluation sketch after this table).
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned.
Software Dependencies | No | The paper mentions the Adam optimizer and the OpenAI Gym interface but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | Table 2 (ALE hyperparameters) lists numerous specific parameters, such as λ = 0.1, mini-batch size 32, learning rate α = 0.00005, and target network update frequency 40000 (see the configuration sketch after this table).
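For reference, the Pseudocode row above points to Algorithm 1 (Deterministic Information-Directed Q-learning). Below is a minimal sketch of the deterministic IDS action-selection rule under the assumptions that an ensemble of Q-value estimates supplies the epistemic uncertainty, as in the paper; the function and argument names (`ids_action`, `q_samples`, `lam`, `rho2`) are illustrative and not taken from the released code.

```python
import numpy as np

def ids_action(q_samples, lam=0.1, rho2=1.0, eps=1e-9):
    """Deterministic IDS action selection (sketch, not the released code).

    q_samples: array of shape (ensemble_size, num_actions) holding
        Q-value estimates from a bootstrapped ensemble.
    lam: confidence-interval width (lambda = 0.1 in Table 2).
    rho2: per-action return variance; rho2 = 1.0 corresponds to the
        homoscedastic variant.
    """
    mu = q_samples.mean(axis=0)      # epistemic mean per action
    sigma = q_samples.std(axis=0)    # epistemic std per action

    # Conservative regret estimate: optimistic best arm vs. pessimistic a.
    delta = (mu + lam * sigma).max() - (mu - lam * sigma)

    # Information gain about the optimal action (eps avoids division by 0).
    info_gain = np.log1p(sigma ** 2 / rho2) + eps

    # Pick the action minimizing the squared-regret / information ratio.
    return int(np.argmin(delta ** 2 / info_gain))
```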
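The evaluation protocol quoted in the Dataset Splits row (train for 1M frames, freeze learning, evaluate for 500K frames, report the average episode return) can be read as the loop below. This is a sketch only, assuming the pre-2021 Gym step API that was current in 2019 and a hypothetical `agent.act` method.

```python
import gym
import numpy as np

EVAL_FRAMES = 500_000  # evaluation length after every 1M training frames

def evaluate(agent, env_id="BreakoutNoFrameskip-v4", num_frames=EVAL_FRAMES):
    """Freeze learning and return the average episode return."""
    env = gym.make(env_id)  # one of the 55 ALE games
    returns, episode_return, frames = [], 0.0, 0
    obs = env.reset()
    while frames < num_frames:
        action = agent.act(obs, greedy=True)  # no learning updates here
        obs, reward, done, _ = env.step(action)
        episode_return += reward
        frames += 1
        if done:
            returns.append(episode_return)
            episode_return = 0.0
            obs = env.reset()
    return float(np.mean(returns)) if returns else episode_return
```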
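Finally, the hyperparameters excerpted from Table 2 in the Experiment Setup row can be collected into a configuration dictionary like the one below; the key names are hypothetical, but the values are the ones quoted above.

```python
# Values quoted from Table 2 of the paper; key names are illustrative.
ALE_HYPERPARAMS = {
    "lambda": 0.1,                 # IDS confidence-interval width (lambda)
    "mini_batch_size": 32,
    "learning_rate": 0.00005,      # Adam step size (alpha)
    "target_update_freq": 40_000,  # target network update frequency
}
```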