reproducibilityindex.ai

Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning

Authors: Honghao Wei, Xiyue Peng, Arnob Ghosh, Xin Liu

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Additionally, we offer a practical version of WSAC and compare it with existing state-of-the-art safe offline RL algorithms in several continuous control environments. WSAC outperforms all baselines across a range of tasks, supporting the theoretical results.
Researcher Affiliation	Academia	Honghao Wei, Washington State University, honghao.wei@wsu.edu; Xiyue Peng, ShanghaiTech University, pengxy2024@shanghaitech.edu.cn; Arnob Ghosh, New Jersey Institute of Technology, arnob.ghosh@njit.edu; Xin Liu, ShanghaiTech University, liuxin7@shanghaitech.edu.cn
Pseudocode	Yes	Algorithm 1 Weighted Safe Actor-Critic (WSAC)
Open Source Code	Yes	Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide the data and code.
Open Datasets	Yes	We use the offline dataset from Liu et al. (2019), where the corresponding expert policies are used to interact with the environments and collect the data.
Dataset Splits	No	The paper uses an offline dataset but does not specify explicit training, validation, or test dataset splits.
Hardware Specification	Yes	We run all the experiments with NVIDIA GeForce RTX 3080 Ti 8 Core Processor.
Software Dependencies	No	The paper mentions using ADAM for optimization but does not provide specific version numbers for programming languages, libraries, or other software dependencies.
Experiment Setup	Yes	Table 3: Hyperparameters of WSAC