Adversarially Trained Weighted Actor-Critic for Safe Offline Reinforcement Learning
Authors: Honghao Wei, Xiyue Peng, Arnob Ghosh, Xin Liu
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Additionally, we offer a practical version of WSAC and compare it with existing state-of-the-art safe offline RL algorithms in several continuous control environments. WSAC outperforms all baselines across a range of tasks, supporting the theoretical results. |
| Researcher Affiliation | Academia | Honghao Wei, Washington State University, honghao.wei@wsu.edu; Xiyue Peng, ShanghaiTech University, pengxy2024@shanghaitech.edu.cn; Arnob Ghosh, New Jersey Institute of Technology, arnob.ghosh@njit.edu; Xin Liu, ShanghaiTech University, liuxin7@shanghaitech.edu.cn |
| Pseudocode | Yes | Algorithm 1 Weighted Safe Actor-Critic (WSAC) (a generic actor-critic sketch is given after the table) |
| Open Source Code | Yes | Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide the data and code. |
| Open Datasets | Yes | We use the offline dataset from Liu et al. (2019), where the corresponding expert policies are used to interact with the environments and collect the data. (A data-collection sketch is given after the table.) |
| Dataset Splits | No | The paper uses an offline dataset but does not specify explicit training, validation, or test dataset splits. |
| Hardware Specification | Yes | We run all the experiments with an NVIDIA GeForce RTX 3080 Ti and an 8-core processor. |
| Software Dependencies | No | The paper mentions using ADAM for optimization but does not provide specific version numbers for programming languages, libraries, or other software dependencies. |
| Experiment Setup | Yes | Table 3: Hyperparameters of WSAC |
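
The pseudocode row above refers to the paper's Algorithm 1 (WSAC), which is not reproduced here. The block below is only a minimal, generic sketch of a safe offline actor-critic update with separate reward and cost critics; the network sizes, the fixed trade-off weight `W`, and the plain TD targets are illustrative assumptions and do not follow the paper's weighted, adversarially trained objective.

```python
# Generic sketch of one safe offline actor-critic update step.
# NOT the paper's Algorithm 1 (WSAC): networks, the fixed cost weight W,
# and the plain TD targets are illustrative assumptions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, GAMMA, W = 3, 1, 0.99, 1.0  # W: assumed cost weight


def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))


actor = nn.Sequential(mlp(STATE_DIM, ACTION_DIM), nn.Tanh())  # deterministic policy
q_reward = mlp(STATE_DIM + ACTION_DIM, 1)  # critic for task reward
q_cost = mlp(STATE_DIM + ACTION_DIM, 1)    # critic for safety cost
opt_actor = torch.optim.Adam(actor.parameters(), lr=3e-4)
opt_critics = torch.optim.Adam(
    list(q_reward.parameters()) + list(q_cost.parameters()), lr=3e-4
)


def update(batch):
    """One update on a fixed offline batch of (s, a, r, c, s', done)."""
    s, a, r, c, s2, done = batch
    with torch.no_grad():
        a2 = actor(s2)
        sa2 = torch.cat([s2, a2], dim=-1)
        target_r = r + GAMMA * (1 - done) * q_reward(sa2)
        target_c = c + GAMMA * (1 - done) * q_cost(sa2)

    # Fit both critics to their TD targets on the offline data.
    sa = torch.cat([s, a], dim=-1)
    critic_loss = ((q_reward(sa) - target_r) ** 2).mean() + \
                  ((q_cost(sa) - target_c) ** 2).mean()
    opt_critics.zero_grad()
    critic_loss.backward()
    opt_critics.step()

    # Policy trades reward against weighted safety cost (Lagrangian-style surrogate).
    pi_sa = torch.cat([s, actor(s)], dim=-1)
    actor_loss = (-q_reward(pi_sa) + W * q_cost(pi_sa)).mean()
    opt_actor.zero_grad()
    actor_loss.backward()
    opt_actor.step()
    return critic_loss.item(), actor_loss.item()


if __name__ == "__main__":
    # Fake offline batch, standing in for a logged dataset.
    B = 32
    batch = (torch.randn(B, STATE_DIM), torch.rand(B, ACTION_DIM) * 2 - 1,
             torch.randn(B, 1), torch.rand(B, 1), torch.randn(B, STATE_DIM),
             torch.zeros(B, 1))
    print(update(batch))
```

The point the sketch illustrates is only the structural split common to safe actor-critic methods: the policy is optimized against the reward critic minus a weighted cost critic, while both critics are fit to the fixed offline batch.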
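
The dataset rows above state that the offline data were collected by expert policies interacting with the environments, with no explicit train/validation/test split reported. The sketch below shows one way such a constrained-MDP buffer could be logged; it assumes the standard Gymnasium step API and the common safe-RL convention of reporting a per-step safety cost in `info["cost"]`, neither of which is taken from the paper.

```python
# Sketch of logging an offline safe-RL dataset with a behavior policy.
# The Gymnasium API and the `info["cost"]` convention are assumptions,
# not details from the paper.
import numpy as np


def collect(env, policy, num_steps, cost_key="cost"):
    """Roll out `policy` and store (s, a, r, c, s', done) transitions."""
    buffer = []
    state, _ = env.reset()
    for _ in range(num_steps):
        action = policy(state)
        next_state, reward, terminated, truncated, info = env.step(action)
        cost = float(info.get(cost_key, 0.0))  # safety signal alongside reward
        buffer.append((state, action, reward, cost, next_state, terminated))
        state = next_state if not (terminated or truncated) else env.reset()[0]
    return buffer
```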