reproducibilityindex.ai

Iterative Reachability Estimation for Safe Reinforcement Learning

Authors: Milan Ganai, Zheng Gong, Chenning Yu, Sylvia Herbert, Sicun Gao

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate the proposed methods on a diverse suite of safe RL environments from Safety Gym, PyBullet, and MuJoCo, and show the benefits in improving both reward performance and safety compared with state-of-the-art baselines.
Researcher Affiliation	Academia	Milan Ganai UC San Diego mganai@ucsd.edu Zheng Gong UC San Diego zhgong@ucsd.edu Chenning Yu UC San Diego chy010@ucsd.edu Sylvia Herbert UC San Diego sherbert@ucsd.edu Sicun Gao UC San Diego sicung@ucsd.edu
Pseudocode	Yes	Algorithm 1 RESPO Actor Critic
Open Source Code	Yes	To ensure a fair comparison, the primal-dual based approaches and unconstrained Vanilla PPO were implemented based off of the same code base [59].
Open Datasets	No	The paper mentions evaluating on 'Safety Gym [30]', 'Safety PyBullet [50]', and 'Safety MuJoCo [51]' environments. While these are widely used, the paper cites the frameworks/engines themselves and does not provide specific access information (links, DOIs, or formal citations for the datasets used within these simulation environments, if applicable) nor does it claim they are publicly available datasets. These are simulation environments rather than static datasets.
Dataset Splits	Yes	Total Env Interactions 9e6, Number Seeds per algorithm per experiment 5.
Hardware Specification	Yes	We run our experiments on Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz with 6 cores.
Software Dependencies	No	The paper mentions implementing approaches 'based off of the same code base [59]' (PPO Lagrangian PyTorch) and '[60]' (OmniSafe). However, it does not list version numbers for software dependencies such as Python, PyTorch, or other libraries used in the experiments, which a reproducible description requires.
Experiment Setup	Yes	Table 2: Hyperparameter Settings Details
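
The Software Dependencies row above notes that version numbers are missing. As a minimal sketch of how such versions could be recorded alongside experiment results, the following uses only the Python standard library. The function name `collect_versions` and the package names in the usage example are illustrative assumptions and are not taken from the paper or its code base.

```python
import platform
from importlib import metadata


def collect_versions(packages):
    """Return a dict mapping 'python' and each package name to a version string.

    Packages that are not installed map to None instead of raising.
    """
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical package list; the paper does not state which packages or versions it used.
    for name, version in collect_versions(["torch", "numpy", "gymnasium"]).items():
        print(f"{name}: {version if version is not None else 'not installed'}")
```

Saving this output next to each run's logs would give a reader the kind of version information the Software Dependencies row reports as absent.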