Adam on Local Time: Addressing Nonstationarity in RL with Relative Adam Timesteps
Authors: Benjamin Ellis, Matthew T Jackson, Andrei Lupu, Alexander D. Goldie, Mattie Fellows, Shimon Whiteson, Jakob Foerster
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluating Adam-Rel in both on-policy and off-policy RL, we demonstrate improved performance in both Atari and Craftax. |
| Researcher Affiliation | Academia | Benjamin Ellis (University of Oxford); Matthew T. Jackson (University of Oxford); Andrei Lupu (University of Oxford); Alexander D. Goldie (University of Oxford); Mattie Fellows (University of Oxford); Shimon Whiteson (University of Oxford); Jakob N. Foerster (University of Oxford) |
| Pseudocode | Yes | Algorithm 1: Pseudocode for PPO with Adam, Adam-Rel, and Adam-MR. |
| Open Source Code | Yes | For the Atari experiments (both DQN and PPO), we based our implementation on CleanRL [19]. This code is available here. For the Craftax experiments, we based our implementation on PureJaxRL [20]. This code is available here. |
| Open Datasets | Yes | To do so, we first train DQN [18, 19] agents with Adam-Rel on the Atari-10 benchmark for 40M frames... extensively evaluate our method's impact on PPO [4, 19, 20], training agents on Craftax-Classic-1B [12] ... and the Atari-57 suite [13] for 40 million frames. |
| Dataset Splits | No | The paper mentions using standard benchmarks like Atari, but it does not explicitly describe the dataset splits (e.g., percentages or sample counts for training, validation, and testing) within its text. |
| Hardware Specification | Yes | Experiments were performed on an internal cluster of NVIDIA V100 GPUs. |
| Software Dependencies | No | The paper mentions using "CleanRL" and "PureJaxRL" as base implementations but does not specify version numbers for these or any other software libraries or frameworks. |
| Experiment Setup | Yes | We provide details of our hyperparameter settings in Appendix F, as well as detailing our experimental setup in Section 5. |
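
The Pseudocode row names three optimizer variants: Adam, Adam-Rel, and Adam-MR. The sketch below illustrates how they might differ, reading the title's "relative timesteps" as restarting Adam's bias-correction step counter whenever the training objective changes. It is not the authors' Algorithm 1. The scalar parameter, the names `AdamState`, `adam_step`, `reset_relative`, and `reset_full`, and the default hyperparameters are all assumptions made for illustration, as is the reading that Adam-MR additionally clears the moment estimates.

```python
import math
from dataclasses import dataclass


@dataclass
class AdamState:
    m: float = 0.0  # first-moment (momentum) estimate
    v: float = 0.0  # second-moment estimate
    t: int = 0      # timestep used for bias correction


def adam_step(param, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update on a scalar parameter."""
    state.t += 1
    state.m = b1 * state.m + (1 - b1) * grad
    state.v = b2 * state.v + (1 - b2) * grad * grad
    m_hat = state.m / (1 - b1 ** state.t)  # bias-corrected first moment
    v_hat = state.v / (1 - b2 ** state.t)  # bias-corrected second moment
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


def reset_relative(state):
    """Adam-Rel-style reset (assumed): restart t, keep m and v."""
    state.t = 0


def reset_full(state):
    """Adam-MR-style reset (assumed): restart t and clear both moments."""
    state.m = 0.0
    state.v = 0.0
    state.t = 0
```

Under these assumptions, a PPO-style loop would call `reset_relative(state)` each time a fresh batch of rollouts changes the objective, then run its usual `adam_step` updates. Plain Adam would skip the reset entirely, and the Adam-MR variant would call `reset_full(state)` instead.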