Topological Experience Replay

Authors: Zhang-Wei Hong, Tao Chen, Yen-Chen Lin, Joni Pajarinen, Pulkit Agrawal

Venue: ICLR 2022

Reproducibility assessment. Each entry gives the variable, the assessed result, and the supporting LLM response.
Research Type: Experimental. "We empirically show that our method is substantially more data-efficient than several baselines on a diverse range of goal-reaching tasks. Notably, the proposed method also outperforms baselines that consume more batches of training experience and operates from high-dimensional observational data such as images."
Researcher Affiliation: Academia. "Improbable AI Lab, Massachusetts Institute of Technology; Aalto University."
Pseudocode: Yes. The paper provides Algorithm 1, "Topological Experience Replay for Q-Learning"; a hedged sketch of the replay scheme follows.
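
The core idea reported in the paper is to store transitions as a directed graph over states and replay them in reverse breadth-first order from terminal states, so that Q-value updates propagate backward from the goal. The sketch below is one minimal reading of that scheme; the class name, the terminal-state bookkeeping, and the batching details are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict, deque

class TopologicalReplayBuffer:
    """Minimal sketch: a replay buffer organized as a directed graph over
    (hashable) states, replayed via backward BFS from terminal states."""

    def __init__(self):
        # incoming[s_next] holds every stored transition ending in s_next
        self.incoming = defaultdict(list)
        self.terminals = set()  # states where an episode terminated

    def add(self, s, a, r, s_next, done):
        self.incoming[s_next].append((s, a, r, s_next, done))
        if done:
            self.terminals.add(s_next)

    def reverse_bfs_batch(self, batch_size):
        """Collect transitions in backward-BFS order from terminal states,
        so transitions closest to the goal are replayed first. In practice
        one would fall back to uniform sampling while the graph has no
        terminal states yet (an assumption of this sketch)."""
        batch = []
        visited = set(self.terminals)
        frontier = deque(self.terminals)
        while frontier and len(batch) < batch_size:
            s_next = frontier.popleft()
            for (s, a, r, s2, done) in self.incoming[s_next]:
                batch.append((s, a, r, s2, done))
                if s not in visited:
                    visited.add(s)
                    frontier.append(s)
        return batch[:batch_size]
```

A Q-learning loop would then update on the returned batch in order, so each transition's bootstrap target state has typically already been refreshed earlier in the same sweep.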
Open Source Code: Yes. "Code is included in the zip file."
Open Datasets: Yes. "We evaluate TER in Minigrid (Chevalier-Boisvert et al., 2018) and Sokoban (Schrader, 2018)", and the references provide URLs: "Maxime Chevalier-Boisvert, Lucas Willems, and Suman Pal. Minimalistic gridworld environment for openai gym. https://github.com/maximecb/gym-minigrid, 2018." and "Max-Philipp B. Schrader. gym-sokoban. https://github.com/mpSchrader/gym-sokoban, 2018." A minimal environment-setup sketch follows.
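
Both cited repositories register their tasks with Gym on import, so a minimal setup plausibly looks like the sketch below. The environment IDs are illustrative examples taken from the gym-minigrid and gym-sokoban READMEs, not necessarily the exact tasks used in the paper, and the four-tuple step API assumes the 2018-era Gym versions those packages target.

```python
import gym
import gym_minigrid  # registers MiniGrid-* environments on import
import gym_sokoban   # registers Sokoban-* environments on import

# Example task IDs (assumptions; see each repository's README for the full list)
minigrid_env = gym.make("MiniGrid-MultiRoom-N6-v0")
sokoban_env = gym.make("Sokoban-v0")

obs = minigrid_env.reset()
obs, reward, done, info = minigrid_env.step(minigrid_env.action_space.sample())
```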
Dataset Splits: No. The paper does not provide training, validation, and test dataset splits; experiments are conducted in simulation environments (Minigrid, Sokoban) where new episodes are generated rather than drawn from fixed splits.
Hardware Specification: No. "We are grateful to MIT Supercloud and the Lincoln Laboratory Supercomputing Center for providing HPC resources." Specific hardware such as GPU or CPU models is not mentioned.
Software Dependencies: No. The paper mentions the pfrl codebase and the Adam and RMSProp optimizers, but does not provide version numbers for these software components.
Experiment Setup: Yes. "Batch size, Optimizer, and Learning rate: For all environments, we set batch size=64 for Minigrid, batch size=32 for Sokoban and Atari. The optimizers are Adam with learning rate=3e-4 for Minigrid and Sokoban. For Atari, we follow the configuration in (Mnih et al., 2015) and use RMSProp optimizer with learning rate=2.5e-4, alpha=0.95, eps=1e-2, and momentum=0.0." An optimizer-configuration sketch follows.
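
These quoted hyperparameters map directly onto standard PyTorch optimizers, as sketched below. The paper builds on the pfrl codebase, so wiring through torch.optim here, and the placeholder q_network, are assumptions for illustration only.

```python
import torch.nn as nn
import torch.optim as optim

q_network = nn.Linear(64, 4)  # hypothetical placeholder Q-network

# Minigrid and Sokoban: Adam with learning rate 3e-4
# (batch size 64 for Minigrid, 32 for Sokoban, per the quoted setup)
adam = optim.Adam(q_network.parameters(), lr=3e-4)

# Atari, following Mnih et al. (2015): RMSProp with the quoted settings
rmsprop = optim.RMSprop(
    q_network.parameters(),
    lr=2.5e-4,
    alpha=0.95,
    eps=1e-2,
    momentum=0.0,
)
```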