reproducibilityindex.ai

Safe Reinforcement Learning via Curriculum Induction

Authors: Matteo Turchetta, Andrey Kolobov, Shital Shah, Andreas Krause, Alekh Agarwal

NeurIPS 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experiments use this framework in two challenging environments to induce curricula for safe and efficient learning.
Researcher Affiliation	Collaboration	Matteo Turchetta, Department of Computer Science, ETH Zurich, matteotu@inf.ethz.ch; Andrey Kolobov, Microsoft Research, Redmond, WA 98052, akolobov@microsoft.com; Shital Shah, Microsoft Research, Redmond, WA 98052, shitals@microsoft.com; Andreas Krause, Department of Computer Science, ETH Zurich, krausea@ethz.ch; Alekh Agarwal, Microsoft Research, Redmond, WA 98052, alekha@microsoft.com
Pseudocode	Yes	Algorithm 1 CISR
Open Source Code	Yes	We release an open source implementation of CISR and of our experiments (https://github.com/zuzuba/CISR_NeurIPS20).
Open Datasets	Yes	The Frozen Lake and Lunar Lander environments from OpenAI Gym [10].
Dataset Splits	No	The paper describes training in RL environments (Frozen Lake and Lunar Lander), where data is generated through interaction, but it does not specify explicit training/validation/test splits with percentages, sample counts, or citations to predefined splits. It mentions evaluating policies in the original environment but not how a static dataset would be split for validation.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory, or cloud instance types) used to run the experiments.
Software Dependencies	No	The paper mentions using the 'Stable Baselines [25] implementation of PPO [43]' and 'GP-UCB [44]' for optimization, but it does not provide version numbers for these software components or for any other dependencies.
Experiment Setup	Yes	For a detailed overview of the hyperparameters and the environments, see Appendices A and B.