Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees

Authors: Dohyeong Kim, Taehyun Cho, Seungyub Han, Hojun Chung, Kyungjae Lee, Songhwai Oh

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Furthermore, the proposed method has been evaluated on continuous control tasks and showed the best performance among other RCRL algorithms satisfying the constraints. The experiments are conducted on the Safety Gymnasium tasks [14] with a single constraint and the legged robot locomotion tasks [15] with multiple constraints.
Researcher Affiliation | Academia | ¹Dep. of Electrical and Computer Engineering, Seoul National University; ²Dep. of Statistics, Korea University
Pseudocode | Yes | An overview of SRCPO is presented in Algorithm 1, and a detailed pseudo-code of the proposed method is described in Algorithm 2.
Open Source Code | Yes | Our code is available at https://github.com/rllab-snu/Spectral-Risk-Constrained-RL.
Open Datasets | Yes | The experiments are conducted on the Safety Gymnasium tasks [14] with a single constraint and the legged robot locomotion tasks [15] with multiple constraints.
Dataset Splits | No | The paper describes the tasks and data collection but does not provide explicit training, validation, and test dataset splits with percentages or counts.
Hardware Specification | Yes | All experiments were conducted on a PC whose CPU and GPU are an Intel Xeon E5-2680 and NVIDIA TITAN Xp, respectively.
Software Dependencies | No | The paper mentions software components such as quantile distributional critics and a truncated normal distribution, but does not provide version numbers for the libraries or frameworks used (e.g., Python, PyTorch, or TensorFlow).
Experiment Setup | Yes | The hyperparameters and network structure of each algorithm are detailed in Appendix D; Table 1 gives the network structures and Table 2 describes the hyperparameter settings.
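The Open Datasets row above points to the Safety Gymnasium benchmark, which is distributed as a set of simulated environments rather than a static dataset. Below is a minimal sketch of how such an environment is typically created and rolled out. The package name `safety_gymnasium`, the task id `SafetyPointGoal1-v0`, and the step signature with a separate cost signal are assumptions based on the publicly documented Safety Gymnasium API, not details quoted from the paper.

```python
# Minimal usage sketch for the Safety Gymnasium benchmark cited in the
# Open Datasets row. The package name, task id, and step signature are
# assumptions based on the public Safety Gymnasium API; the paper only
# states that Safety Gymnasium tasks with a single constraint are used.
import safety_gymnasium

env = safety_gymnasium.make("SafetyPointGoal1-v0")
obs, info = env.reset(seed=0)

episode_return, episode_cost = 0.0, 0.0
for _ in range(1000):
    action = env.action_space.sample()  # stand-in for a learned policy
    obs, reward, cost, terminated, truncated, info = env.step(action)
    episode_return += reward
    episode_cost += cost  # constraint signal consumed by RCRL methods
    if terminated or truncated:
        obs, info = env.reset()

env.close()
print(f"return: {episode_return:.2f}, cost: {episode_cost:.2f}")
```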
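The Software Dependencies row notes that the method relies on quantile distributional critics. For context, the sketch below shows how a spectral risk measure can be estimated from a quantile critic's outputs, following the standard definition rho_sigma(Z) = ∫_0^1 sigma(u) F_Z^{-1}(u) du with a non-negative, non-decreasing spectrum sigma that integrates to 1; CVaR at level alpha is the special case sigma(u) = 1[u >= alpha] / (1 - alpha). This is an illustrative NumPy sketch, not the authors' implementation; the function names and the toy quantile values are hypothetical.

```python
# Illustrative sketch (not the authors' code): estimating a spectral risk
# measure of a cost return from the output of a quantile distributional
# critic evaluated at the quantile midpoints tau_i = (i + 0.5) / N.
import numpy as np

def spectral_risk(quantiles: np.ndarray, sigma) -> float:
    """Discretized spectral risk from N sorted quantile estimates."""
    n = len(quantiles)
    taus = (np.arange(n) + 0.5) / n                    # quantile midpoints
    weights = np.array([sigma(t) for t in taus]) / n   # Riemann weights
    return float(np.dot(weights, quantiles))

def cvar_spectrum(alpha: float):
    """Risk spectrum whose spectral risk equals CVaR at level alpha."""
    return lambda u: (u >= alpha) / (1.0 - alpha)

# Example: CVaR_0.9 of a toy cost-return distribution standing in for
# the quantile critic's output.
rng = np.random.default_rng(0)
sample_quantiles = np.sort(rng.normal(loc=1.0, scale=0.5, size=64))
print(spectral_risk(sample_quantiles, cvar_spectrum(0.9)))
```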