Understanding and Preventing Capacity Loss in Reinforcement Learning
Authors: Clare Lyle, Mark Rowland, Will Dabney
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a rigorous empirical analysis of this phenomenon which considers both the ability of networks to learn new target functions via gradient-based optimization methods, and their ability to linearly disentangle states' feature representations. |
| Researcher Affiliation | Collaboration | Clare Lyle, Department of Computer Science, University of Oxford; Mark Rowland & Will Dabney, DeepMind |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Agent: We train a Rainbow agent (Hessel et al., 2018) with the same architecture and hyperparameters as are described in the open-source implementation made available by Quan & Ostrovski (2020). URL http://github.com/deepmind/dqn_zoo. |
| Open Datasets | Yes | To evaluate Hypothesis 1, we construct a series of toy iterative prediction problems on the MNIST data set, a widely-used computer vision benchmark which consists of images of handwritten digits and corresponding labels. (A hedged sketch of such an iterative target-fitting probe appears after the table.) |
| Dataset Splits | Yes | Training: We follow the training procedure found in the Rainbow implementation mentioned above. We train for 200 million frames, with 500K evaluation frames interspersed every 1M training frames. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions software such as the Jax Haiku library and refers to specific algorithms like DQN, QR-DQN, Rainbow, and DDQN, but does not provide version numbers for these libraries or frameworks. |
| Experiment Setup | Yes | The evaluations in Figure 5 are for k = 10 heads with β = 100 and α = 0.1, and we show the method's robustness to these hyperparameters in Appendix C.1. (A hedged sketch of this auxiliary-head regularizer also appears below.) |
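
To make the Open Datasets row concrete, the sketch below illustrates the kind of iterative target-fitting probe the paper describes: a single network is asked to fit a sequence of fresh random regression targets, and its final fit on each task is recorded. Synthetic inputs stand in for MNIST images so the example is self-contained, and the names (`predict`, `sgd_step`) are illustrative assumptions, not the authors' code.

```python
import jax
import jax.numpy as jnp

def predict(params, x):
    # Small MLP standing in for the networks probed in the paper.
    return jax.nn.relu(x @ params["w1"]) @ params["w2"]

def mse(params, x, y):
    return jnp.mean((predict(params, x) - y) ** 2)

@jax.jit
def sgd_step(params, x, y, lr=1e-2):
    grads = jax.grad(mse)(params, x, y)
    return jax.tree_util.tree_map(lambda p, g: p - lr * g, params, grads)

key = jax.random.PRNGKey(0)
kx, k1, k2 = jax.random.split(key, 3)
x = jax.random.normal(kx, (256, 784))          # stand-in for MNIST images
params = {"w1": jax.random.normal(k1, (784, 128)) * 0.05,
          "w2": jax.random.normal(k2, (128, 1)) * 0.05}

# Fit a sequence of fresh random targets with the same network; a declining
# ability to fit later targets is the capacity-loss signature the paper studies.
for task in range(5):
    key, kt = jax.random.split(key)
    y = jax.random.normal(kt, (256, 1))        # new target function
    for _ in range(500):
        params = sgd_step(params, x, y)
    print(f"task {task}: final MSE = {mse(params, x, y):.4f}")
```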
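
The Experiment Setup row quotes the hyperparameters of the paper's InFeR regularizer: k = 10 auxiliary linear heads, output scale β = 100, and loss weight α = 0.1. The JAX sketch below shows one plausible reading of that loss, regressing scaled auxiliary-head outputs onto the outputs of a frozen copy of the network at initialization; the toy architecture and names here are assumptions, not the open-source implementation.

```python
import jax
import jax.numpy as jnp

K, BETA, ALPHA = 10, 100.0, 0.1   # values quoted in the Experiment Setup row

def features(params, x):
    # Toy one-hidden-layer torso standing in for the agent's feature extractor.
    return jax.nn.relu(x @ params["w1"] + params["b1"])

def head_outputs(heads, phi):
    # k auxiliary linear heads on the penultimate features, scaled by beta.
    return BETA * (phi @ heads)               # shape: [batch, K]

def infer_loss(params, heads, init_params, init_heads, x):
    # Regress current scaled head outputs onto those of the frozen initial
    # network; this term is added (weighted by alpha) to the usual RL loss.
    preds = head_outputs(heads, features(params, x))
    targets = jax.lax.stop_gradient(
        head_outputs(init_heads, features(init_params, x)))
    return ALPHA * jnp.mean((preds - targets) ** 2)

# Usage: keep an untouched copy of the parameters from initialization.
key = jax.random.PRNGKey(0)
k1, k2 = jax.random.split(key)
params = {"w1": jax.random.normal(k1, (784, 256)) * 0.05,
          "b1": jnp.zeros(256)}
heads = jax.random.normal(k2, (256, K)) * 0.05
init_params, init_heads = params, heads        # frozen snapshot at init
x = jax.random.normal(key, (32, 784))
print(infer_loss(params, heads, init_params, init_heads, x))  # 0.0 at init
```

Because the targets are the network's own outputs at initialization, the penalty is zero at the start of training and grows only as the representation drifts, which matches the paper's goal of preventing feature-rank collapse.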