Prioritized Level Replay
Authors: Minqi Jiang, Edward Grefenstette, Tim Rocktäschel
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate PLR on several PCG environments with various combinations of scoring functions and prioritization schemes, and compare to the most common direct level sampling baseline of P_train(l \| Λ_train) = Uniform(l; Λ_train). We train and test on all 16 environments in the Procgen Benchmark on easy and hard difficulties, but focus discussion on the easy results, which allow direct comparison to several prior studies. |
| Researcher Affiliation | Collaboration | ¹Facebook AI Research, London, United Kingdom; ²University College London, London, United Kingdom. Correspondence to: Minqi Jiang <msj@fb.com>. |
| Pseudocode | Yes | Algorithm 1 Policy-gradient training loop with PLR; Algorithm 2 Experience collection with PLR |
| Open Source Code | Yes | Our code is available at https://github.com/facebookresearch/level-replay. |
| Open Datasets | Yes | We evaluate PLR on several PCG environments... We train and test on all 16 environments in the Procgen Benchmark... For Procgen, we use the same ResBlock architecture as Cobbe et al. (2020a) and train for 25M total steps on 200 levels on the easy setting as in the original baselines. |
| Dataset Splits | No | The paper mentions training and testing but does not explicitly describe validation dataset splits. It evaluates performance on 'unseen test levels'. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'PPO with GAE for training' but does not specify versions for PPO, GAE, or any other software libraries or programming languages. |
| Experiment Setup | Yes | For Procgen, we use the same ResBlock architecture as Cobbe et al. (2020a) and train for 25M total steps on 200 levels on the easy setting as in the original baselines. For MiniGrid, we use a 3-layer CNN architecture based on Igl et al. (2019), and provide approximately 1000 levels of each difficulty per environment during training. Detailed descriptions of the environments, architectures, and hyperparameters used in our experiments (and how they were set or obtained) can be found in Appendix A. |
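
To make the pseudocode row above concrete, the sketch below illustrates the replay distribution that Algorithms 1 and 2 sample from, as described in the paper: rank-based prioritization of per-level scores with a temperature β, mixed with a staleness term weighted by a coefficient ρ. This is a minimal illustration, not the authors' released implementation (see the repository linked above); the function name and the default β and ρ values are assumptions made for the sake of the example.

```python
import numpy as np

def replay_distribution(scores, last_visit, episode, beta=0.1, rho=0.1):
    """Sketch of PLR's replay distribution P_replay over seen levels.

    scores:     per-level scores S_i (e.g., mean L1 value loss / |GAE|)
    last_visit: episode index C_i at which each level was last sampled
    episode:    current global episode count c
    beta:       temperature for rank-based prioritization (assumed value)
    rho:        staleness mixing coefficient (assumed value)
    """
    scores = np.asarray(scores, dtype=np.float64)

    # Rank-based score prioritization: h(S_i) = 1 / rank(S_i),
    # where the highest-scoring level gets rank 1.
    ranks = np.empty_like(scores)
    ranks[np.argsort(-scores)] = np.arange(1, len(scores) + 1)
    h = (1.0 / ranks) ** (1.0 / beta)
    p_score = h / h.sum()

    # Staleness-aware prioritization: favor levels whose scores are
    # oldest, so stale value estimates get refreshed.
    staleness = episode - np.asarray(last_visit, dtype=np.float64)
    total = staleness.sum()
    p_stale = staleness / total if total > 0 else np.full(len(scores), 1.0 / len(scores))

    # P_replay is a mixture of the two distributions.
    return (1.0 - rho) * p_score + rho * p_stale
```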
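
Algorithm 1's outer loop then decides, per episode, whether to visit an unseen level or replay a seen one from that distribution. The schematic below reuses `replay_distribution` from the sketch above; the fixed `p_replay` probability is a stand-in assumption for the paper's replay-decision distribution P_D, not a value taken from the paper.

```python
import random
import numpy as np

def sample_level(seen, unseen, scores, last_visit, episode, p_replay=0.5):
    """One level-sampling step of a PLR-style training loop (schematic).

    seen/unseen: lists of level identifiers
    scores, last_visit: dicts keyed by level id, maintained by the caller
    Returns (level, is_new). After collecting a trajectory on a new level,
    the caller is expected to record scores[level] and last_visit[level]
    before the next call.
    """
    if unseen and (not seen or random.random() > p_replay):
        # Visit a new level, sampled uniformly as in the baseline
        # P_train(l | Λ_train) = Uniform(l; Λ_train); it joins the seen set.
        level = unseen.pop(random.randrange(len(unseen)))
        seen.append(level)
        return level, True

    # Replay a seen level, drawn from the prioritized distribution.
    probs = replay_distribution(
        [scores[l] for l in seen],
        [last_visit[l] for l in seen],
        episode,
    )
    idx = np.random.choice(len(seen), p=probs)
    return seen[idx], False
```

Once every training level has been visited (e.g., all 200 Procgen easy-setting levels), the unseen list empties and every subsequent episode is drawn from the prioritized replay distribution.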