reproducibilityindex.ai

Reincarnating Reinforcement Learning: Reusing Prior Computation to Accelerate Progress

Authors: Rishabh Agarwal, Max Schwarzer, Pablo Samuel Castro, Aaron C. Courville, Marc Bellemare

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate reincarnating RL's gains over tabula rasa RL on Atari 2600 games, a challenging locomotion task, and the real-world problem of navigating stratospheric balloons.
Researcher Affiliation	Collaboration	1 Google Research, Brain Team; 2 MILA
Pseudocode	No	The paper does not contain a pseudocode block or algorithm section.
Open Source Code	Yes	Open-sourced code and trained agents at agarwl.github.io/reincarnating_rl.
Open Datasets	Yes	We conduct experiments on ALE with sticky actions [57]. To reduce the computational cost of our experiments, we use a subset of 10 commonly-used Atari 2600 games: Asterix, Breakout, Space Invaders, Seaquest, Q*Bert, Beam Rider, Enduro, Ms Pacman, Bowling and River Raid.
Dataset Splits	No	For the results in Section 4, we use 3 seeds per game on 10 games.
Hardware Specification	Yes	We obtain the teacher policy πT by running DQN [60] with Adam optimizer for 400 million environment frames, requiring 7 days of training per run with Dopamine [15] on P100 GPUs.
Software Dependencies	No	We use actor-critic agents in Acme [37].
Experiment Setup	Yes	For the experiments in Section 4, we use a learning rate of 1e-4, Adam optimizer, a batch size of 32, a discount factor of 0.99, a target update period of 2000, a replay buffer size of 1M, and an epsilon schedule of 250k frames.
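
The Section 4 hyperparameters quoted in the Experiment Setup row can be collected into a single configuration object, which may help when trying to reproduce the runs. The following is a minimal sketch, not the authors' code: the class name `Section4Config`, the field names, and the `linear_epsilon` helper are hypothetical. The entry gives only the length of the epsilon schedule (250k frames), so the linear shape and the `start` and `end` values below are assumptions. The unit of the target update period is also not stated in the entry and is left as a bare number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Section4Config:
    """Hyperparameters as quoted in the Experiment Setup row (names are hypothetical)."""
    learning_rate: float = 1e-4
    optimizer: str = "adam"
    batch_size: int = 32
    discount: float = 0.99
    target_update_period: int = 2000   # unit not stated in the entry
    replay_capacity: int = 1_000_000   # "replay buffer size of 1M"
    epsilon_decay_frames: int = 250_000


def linear_epsilon(frame: int, config: Section4Config,
                   start: float = 1.0, end: float = 0.01) -> float:
    """Linearly anneal epsilon from `start` to `end` over the decay window.

    ASSUMPTION: the schedule shape and both endpoint values are illustrative;
    the entry only states that the schedule spans 250k frames.
    """
    # Clamp progress to [0, 1] so epsilon stays at `end` after the window.
    fraction = min(max(frame / config.epsilon_decay_frames, 0.0), 1.0)
    return start + fraction * (end - start)


config = Section4Config()
for frame in (0, 125_000, 250_000, 1_000_000):
    print(frame, round(linear_epsilon(frame, config), 4))
```

Freezing the dataclass keeps the quoted values from being changed accidentally during a run; any endpoint values actually used should be taken from the paper or the released code rather than from this sketch.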