Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps

Authors: Benjamin Ellis, Matthew T Jackson, Andrei Lupu, Alexander D. Goldie, Mattie Fellows, Shimon Whiteson, Jakob Foerster

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Evaluating Adam-Rel in both on-policy and off-policy RL, we demonstrate improved performance in both Atari and Craftax.
Researcher Affiliation | Academia | Benjamin Ellis (University of Oxford), Matthew T. Jackson (University of Oxford), Andrei Lupu (University of Oxford), Alexander D. Goldie (University of Oxford), Mattie Fellows (University of Oxford), Shimon Whiteson (University of Oxford), Jakob N. Foerster (University of Oxford)
Pseudocode | Yes | Algorithm 1: Pseudocode for PPO with Adam, Adam-Rel, and Adam-MR. (An illustrative sketch of the relative-timestep mechanism follows the table.)
Open Source Code | Yes | For the Atari experiments (both DQN and PPO), we based our implementation on CleanRL [19]. This code is available here. For the Craftax experiments, we based our implementation on PureJaxRL [20]. This code is available here.
Open Datasets | Yes | To do so, we first train DQN [18, 19] agents with Adam-Rel on the Atari-10 benchmark for 40M frames... extensively evaluate our method's impact on PPO [4, 19, 20], training agents on Craftax-Classic-1B [12] ... and the Atari-57 suite [13] for 40 million frames.
Dataset Splits | No | The paper mentions using standard benchmarks like Atari, but it does not explicitly describe dataset splits (e.g., percentages or sample counts for training, validation, and testing) within its text.
Hardware Specification | Yes | Experiments were performed on an internal cluster of NVIDIA V100 GPUs.
Software Dependencies | No | The paper mentions using "CleanRL" and "PureJaxRL" as base implementations but does not specify version numbers for these or any other software libraries or frameworks.
Experiment Setup | Yes | We provide details of our hyperparameter settings in Appendix F, as well as detailing our experimental setup in Section 5.
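
Below is a minimal sketch of the relative-timestep idea named in the paper's title and in Algorithm 1, written in plain Python/NumPy. It assumes that Adam's timestep counter is reset at each nonstationarity boundary (for example, at the start of each PPO update phase or after a DQN target-network update) while the moment estimates are carried over. The class name AdamRelSketch, the reset() method, and the default hyperparameters are illustrative assumptions, not the authors' released code; the CleanRL- and PureJaxRL-based repositories referenced above contain the actual implementations.

```python
import numpy as np


class AdamRelSketch:
    """Minimal sketch of Adam with a relative (resettable) timestep.

    Illustrative reconstruction of the relative-timestep idea, not the
    authors' released implementation; names and defaults are assumptions.
    """

    def __init__(self, lr=3e-4, beta1=0.9, beta2=0.999, eps=1e-8):
        self.lr, self.beta1, self.beta2, self.eps = lr, beta1, beta2, eps
        self.m = None  # first-moment (momentum) estimate
        self.v = None  # second-moment estimate
        self.t = 0     # timestep used for bias correction

    def reset(self):
        # Relative timestep: set t back to zero at each nonstationarity
        # boundary (e.g., a new PPO update phase or a DQN target update),
        # so bias correction runs on "local" rather than global time.
        # The moment estimates are kept here; whether to also reset them
        # is a separate design choice.
        self.t = 0

    def step(self, params, grads):
        """One standard Adam update on NumPy arrays; only the meaning of
        self.t differs from vanilla Adam because it may have been reset."""
        if self.m is None:
            self.m = np.zeros_like(params)
            self.v = np.zeros_like(params)
        self.t += 1
        self.m = self.beta1 * self.m + (1.0 - self.beta1) * grads
        self.v = self.beta2 * self.v + (1.0 - self.beta2) * grads ** 2
        m_hat = self.m / (1.0 - self.beta1 ** self.t)  # bias correction with local t
        v_hat = self.v / (1.0 - self.beta2 ** self.t)
        return params - self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
```

In an on-policy loop, reset() would be called once before each update phase, so the bias-correction terms 1 - beta1^t and 1 - beta2^t restart from t = 1 whenever the optimisation objective changes.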