Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning

Authors: Honghao Wei, Xiyue Peng, Arnob Ghosh, Xin Liu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Additionally, we offer a practical version of WSAC and compare it with existing state-of-the-art safe offline RL algorithms in several continuous control environments. WSAC outperforms all baselines across a range of tasks, supporting the theoretical results.
Researcher Affiliation | Academia | Honghao Wei (Washington State University, honghao.wei@wsu.edu); Xiyue Peng (ShanghaiTech University, pengxy2024@shanghaitech.edu.cn); Arnob Ghosh (New Jersey Institute of Technology, arnob.ghosh@njit.edu); Xin Liu (ShanghaiTech University, liuxin7@shanghaitech.edu.cn)
Pseudocode | Yes | Algorithm 1: Weighted Safe Actor-Critic (WSAC) (an illustrative sketch follows this table)
Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide the data and code.
Open Datasets | Yes | We use the offline dataset from Liu et al. (2019), where the corresponding expert policies are used to interact with the environments and collect the data.
Dataset Splits | No | The paper uses an offline dataset but does not specify explicit training, validation, or test splits.
Hardware Specification | Yes | We run all the experiments on an NVIDIA GeForce RTX 3080 Ti with an 8-core processor.
Software Dependencies | No | The paper mentions using Adam for optimization but does not provide version numbers for programming languages, libraries, or other software dependencies.
Experiment Setup | Yes | Table 3: Hyperparameters of WSAC
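For readers who want a concrete picture of the actor-critic structure behind Algorithm 1, below is a minimal, hypothetical sketch of a safe actor-critic update with separate reward and cost critics and a cost penalty in the actor objective. Every name and value here (actor, reward_critic, cost_critic, beta, cost_limit, learning rates, network sizes) is an illustrative assumption; this is not the paper's WSAC procedure, which additionally uses adversarially trained weighted critics. Consult the paper's Algorithm 1 and Table 3 for the actual algorithm and hyperparameters.

```python
# Hypothetical sketch of a safe actor-critic update step with a reward
# critic and a cost critic. Illustrative only; not the paper's Algorithm 1.
import torch
import torch.nn as nn

state_dim, action_dim = 8, 2  # assumed toy dimensions

actor = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(),
                      nn.Linear(64, action_dim), nn.Tanh())
reward_critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64),
                              nn.ReLU(), nn.Linear(64, 1))
cost_critic = nn.Sequential(nn.Linear(state_dim + action_dim, 64),
                            nn.ReLU(), nn.Linear(64, 1))

# The paper reports Adam as the optimizer; learning rates are assumptions.
opt_actor = torch.optim.Adam(actor.parameters(), lr=3e-4)
opt_critics = torch.optim.Adam(
    list(reward_critic.parameters()) + list(cost_critic.parameters()), lr=3e-4)

gamma, beta, cost_limit = 0.99, 1.0, 0.1  # illustrative values only


def update(batch):
    # Offline transitions; a2 is the current policy's action at s2.
    s, a, r, c, s2, a2 = batch

    # Critic step: standard TD targets for the reward and cost critics.
    with torch.no_grad():
        target_r = r + gamma * reward_critic(torch.cat([s2, a2], -1))
        target_c = c + gamma * cost_critic(torch.cat([s2, a2], -1))
    sa = torch.cat([s, a], -1)
    critic_loss = ((reward_critic(sa) - target_r) ** 2).mean() \
                + ((cost_critic(sa) - target_c) ** 2).mean()
    opt_critics.zero_grad(); critic_loss.backward(); opt_critics.step()

    # Actor step: maximize reward value while penalizing expected cost
    # above the limit (a simple penalty stand-in for WSAC's weighted,
    # adversarial objective).
    s_pi = torch.cat([s, actor(s)], -1)
    actor_loss = -(reward_critic(s_pi).mean()
                   - beta * torch.relu(cost_critic(s_pi).mean() - cost_limit))
    opt_actor.zero_grad(); actor_loss.backward(); opt_actor.step()
    return critic_loss.item(), actor_loss.item()


# Usage with random stand-in data, just to show the update runs end to end.
B = 32
s, s2 = torch.randn(B, state_dim), torch.randn(B, state_dim)
a = torch.randn(B, action_dim)
r, c = torch.randn(B, 1), torch.rand(B, 1)
a2 = actor(s2).detach()
print(update((s, a, r, c, s2, a2)))
```

The separate cost critic and explicit cost limit reflect the safe-RL constraint structure; how the critics are weighted and trained adversarially is exactly what distinguishes WSAC from this generic skeleton.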