Information-Directed Exploration for Deep Reinforcement Learning
Authors: Nikolay Nikolov, Johannes Kirschner, Felix Berkenkamp, Andreas Krause
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our method on Atari games and demonstrate a significant improvement over alternative approaches. |
| Researcher Affiliation | Academia | Nikolay Nikolov (Imperial College London, ETH Zurich) nikolay.nikolov14@imperial.ac.uk; Johannes Kirschner, Felix Berkenkamp, Andreas Krause (ETH Zurich) {jkirschner, befelix}@inf.ethz.ch, krausea@ethz.ch |
| Pseudocode | Yes | Algorithm 1: Deterministic Information-Directed Q-learning (the underlying action-selection rule is sketched below the table) |
| Open Source Code | Yes | Our code can be found at https://github.com/nikonikolov/rltf/tree/ids-drl. |
| Open Datasets | Yes | We now provide experimental results on 55 of the Atari 2600 games from the Arcade Learning Environment (ALE) (Bellemare et al., 2013), simulated via the Open AI gym interface (Brockman et al., 2016). |
| Dataset Splits | No | The paper describes an evaluation protocol rather than explicit train/validation/test splits: "Every 1M training frames, learning is frozen, the agent is evaluated for 500K frames and performance is computed as the average episode return from this latest evaluation run." |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned. |
| Software Dependencies | No | The paper mentions 'Adam optimizer' and 'Open AI gym interface' but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Table 2 (ALE hyperparameters) lists numerous specific parameters, such as "λ 0.1", "mini-batch size 32", "learning rate α 0.00005", and "target network update frequency 40000" (collected into a configuration sketch below the table). |
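
The deterministic IDS rule that Algorithm 1 is built around selects the action minimizing the regret-information ratio, following Russo & Van Roy's information-directed sampling. Below is a minimal sketch of that selection step, assuming per-action regret and information-gain estimates are already available from the agent; the function name, the `eps` guard, and the example numbers are illustrative, not the authors' implementation.

```python
import numpy as np

def ids_action(regret, info_gain, eps=1e-9):
    """Deterministic IDS: pick the action minimizing the
    regret-information ratio psi(a) = regret(a)^2 / info_gain(a).

    regret:    per-action estimated regret, Delta_hat(a)
    info_gain: per-action estimated information gain, I(a)
    eps:       small constant to avoid division by zero (our addition)
    """
    psi = regret ** 2 / (info_gain + eps)
    return int(np.argmin(psi))

# Illustrative usage with made-up estimates for a 4-action problem
regret = np.array([0.5, 0.1, 0.3, 0.2])
info_gain = np.array([0.05, 0.01, 0.2, 0.1])
print(ids_action(regret, info_gain))  # returns the index of the smallest ratio
```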
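
For reference, the Table 2 values quoted in the Experiment Setup row could be gathered into a single configuration mapping. This is a sketch only: the values are taken from the quoted entries, while the dict layout, key names, and unit comments are assumptions rather than the authors' code.

```python
# Hyperparameters quoted from the paper's Table 2 (ALE);
# key names and structure are illustrative.
ALE_HYPERPARAMS = {
    "lambda": 0.1,                        # λ, as listed in Table 2
    "mini_batch_size": 32,
    "learning_rate": 5e-5,                # α, used with the Adam optimizer
    "target_network_update_freq": 40000,  # units as reported in Table 2
}
```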