Safe Reinforcement Learning via Curriculum Induction

Authors: Matteo Turchetta, Andrey Kolobov, Shital Shah, Andreas Krause, Alekh Agarwal

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments use this framework in two challenging environments to induce curricula for safe and efficient learning.
Researcher Affiliation | Collaboration | Matteo Turchetta, Department of Computer Science, ETH Zurich (matteotu@inf.ethz.ch); Andrey Kolobov, Microsoft Research, Redmond, WA 98052 (akolobov@microsoft.com); Shital Shah, Microsoft Research, Redmond, WA 98052 (shitals@microsoft.com); Andreas Krause, Department of Computer Science, ETH Zurich (krausea@ethz.ch); Alekh Agarwal, Microsoft Research, Redmond, WA 98052 (alekha@microsoft.com)
Pseudocode | Yes | Algorithm 1: CISR
Open Source Code | Yes | We release an open source implementation of CISR and of our experiments (https://github.com/zuzuba/CISR_NeurIPS20).
Open Datasets | Yes | The Frozen Lake and Lunar Lander environments from OpenAI Gym [10] (see the usage sketch after the table).
Dataset Splits | No | The paper describes the training process in RL environments (Frozen Lake and Lunar Lander) where data is generated through interaction, but it does not specify explicit training/validation/test dataset splits with percentages, sample counts, or citations to predefined splits. It mentions evaluating policies in the original environment but not how a static dataset would be split for validation.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU or GPU models, memory, or cloud instance types) used to run the experiments.
Software Dependencies | No | The paper mentions using the 'Stable Baselines [25] implementation of PPO [43]' and 'GP-UCB [44]' for optimization, but it does not provide version numbers for these components or other dependencies (see the training sketch after the table).
Experiment Setup | Yes | For a detailed overview of the hyperparameters and the environments, see Appendices A and B.
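For reference, the environments listed under Open Datasets are standard OpenAI Gym tasks; the paper builds curriculum variants on top of them. Below is a minimal sketch of instantiating and rolling out the base environments. The environment IDs (FrozenLake-v1, LunarLander-v2) and the pre-0.26 Gym reset/step signatures are assumptions, since the record notes that no versions are pinned, and this is not the paper's modified setup.

```python
# Minimal sketch (not from the paper): rolling out the base Gym environments
# named in the "Open Datasets" row with a random policy. CISR itself trains on
# modified curriculum variants of these tasks, which are not reproduced here.
import gym

# Environment IDs assumed from standard Gym releases; the paper does not pin versions.
for env_id in ("FrozenLake-v1", "LunarLander-v2"):
    env = gym.make(env_id)
    obs = env.reset()                       # pre-0.26 Gym API assumed
    done, episode_return = False, 0.0
    while not done:
        action = env.action_space.sample()  # random actions, for illustration only
        obs, reward, done, info = env.step(action)
        episode_return += reward
    env.close()
    print(f"{env_id}: random-policy return = {episode_return:.2f}")
```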
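Similarly, the Software Dependencies row names the Stable Baselines implementation of PPO without version numbers. The sketch below shows a generic training loop with the original TensorFlow-based Stable Baselines package (PPO2) on one of the Gym tasks; the environment choice, hyperparameters, and timestep budget are placeholders rather than the paper's configuration, and the GP-UCB teacher optimization is not shown.

```python
# Minimal sketch (not from the paper): training PPO with the original
# TensorFlow-based Stable Baselines package on a base Gym task. Settings here
# are illustrative defaults; the paper's actual hyperparameters are in its appendices.
import gym
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

env = gym.make("LunarLander-v2")            # placeholder choice of environment
model = PPO2(MlpPolicy, env, verbose=1)     # default PPO2 hyperparameters
model.learn(total_timesteps=100_000)        # illustrative budget, not the paper's

# Evaluate the learned policy for a single episode.
obs = env.reset()
done, episode_return = False, 0.0
while not done:
    action, _states = model.predict(obs, deterministic=True)
    obs, reward, done, info = env.step(action)
    episode_return += reward
print(f"Episode return after training: {episode_return:.2f}")
env.close()
```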