
Feasible Reachable Policy Iteration

Authors: Shentao Qin, Yujie Yang, Yao Mu, Jie Li, Wenjun Zou, Jingliang Duan, Shengbo Eben Li

ICML 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The experimental results verify the effectiveness of the proposed FR function in both improving the convergence speed of better or comparable performance without sacrificing safety and identifying a smaller policy space with higher sample efficiency. We test our algorithm on the frozen lake (gym) environment, two classical control tasks, and the safety gym benchmark.
Researcher Affiliation	Academia	(1) School of Vehicle and Mobility, Tsinghua University, Beijing, China; (2) Department of Computer Science, The University of Hong Kong, Hong Kong, China; (3) School of Mechanical Engineering, University of Science and Technology Beijing, Beijing, China.
Pseudocode	Yes	Algorithm 1: Feasible Reachable Region Identification; Algorithm 2: Feasible Reachable Policy Iteration (FRPI)
Open Source Code	No	The paper does not provide an explicit statement about releasing its source code or a direct link to a code repository.
Open Datasets	Yes	We compare the algorithms on four high-dimensional robot navigation tasks in Safety Gym (Ray et al., 2019)
Dataset Splits	No	The paper does not explicitly provide details about training, validation, and test dataset splits by percentage, sample counts, or a specific splitting methodology.
Hardware Specification	Yes	We conducted training on an NVIDIA GPU 3090 using JAX, setting XLA_PYTHON_CLIENT_MEM_FRACTION to 0.1, which allocates 2720 MB of GPU memory.
Software Dependencies	No	The paper mentions 'JAX' but does not specify a version number. Other software or library dependencies are not listed with specific version numbers.
Experiment Setup	Yes	The hyperparameters used in the experiments are listed in Tab. 4.
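
The Hardware Specification row refers to JAX's GPU memory-fraction setting, which is controlled by the environment variable XLA_PYTHON_CLIENT_MEM_FRACTION. The paper does not show how the variable was set, so the following is only a minimal sketch of one common way to do it from Python. The helper name `configure_jax_memory` is hypothetical and is not taken from the paper; the sketch assumes the variable is set before JAX initializes its GPU backend, which in practice usually means before `import jax`.

```python
import os


def configure_jax_memory(fraction: float = 0.1) -> str:
    """Hypothetical helper: set the GPU memory fraction that JAX may preallocate.

    Assumption: this is called before JAX initializes its backend
    (typically before `import jax`); otherwise the setting may be ignored.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in the interval (0, 1]")
    os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = str(fraction)
    return os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"]


# 0.1 is the value reported in the Hardware Specification row.
value = configure_jax_memory(0.1)

# JAX would be imported only after this point, for example:
# import jax
```

The same effect can be had by exporting the variable in the shell before launching the training script, which avoids any dependence on import order.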