Uncertainty-Based Offline Reinforcement Learning with Diversified Q-Ensemble
Authors: Gaon An, Seungyong Moon, Jang-Hyun Kim, Hyun Oh Song
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed method on D4RL benchmarks [9] and verify that the proposed method outperforms the previous state-of-the-art by a large margin on various types of environments and datasets. |
| Researcher Affiliation | Collaboration | Seoul National University, Neural Processing Research Center, DeepMetrics |
| Pseudocode | Yes | Algorithm 1 Ensemble-Diversified Actor Critic (EDAC) |
| Open Source Code | Yes | The code is available online: https://github.com/snu-mllab/EDAC |
| Open Datasets | Yes | We evaluate our proposed methods against the previous offline RL algorithms on the standard D4RL benchmark [9]. |
| Dataset Splits | No | The paper uses standard D4RL benchmarks but does not explicitly provide the specific train/validation/test dataset splits (e.g., percentages or sample counts) used for their experiments. |
| Hardware Specification | Yes | We run our experiments on a single machine with one RTX 3090 GPU |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries used in the implementation. |
| Experiment Setup | Yes | Appendix B Implementation Details: We use Adam optimizer with learning rates 3e-4 and 1e-4 for Q-functions and actors respectively. The batch size is 256. The networks for both Q-functions and actor are MLPs with two hidden layers of size 256 and ReLU activation. |
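
For readers who want to mirror the setup quoted in the Experiment Setup row, below is a minimal PyTorch sketch of the reported hyperparameters (two hidden layers of 256 units with ReLU activations, Adam with learning rates 3e-4 for the Q-functions and 1e-4 for the actor, batch size 256). The function and variable names, the example state/action dimensions, and the ensemble size are illustrative assumptions and are not taken from the authors' repository.

```python
import torch
import torch.nn as nn

# Sketch of the network/optimizer configuration quoted from Appendix B.
# Names and example dimensions are assumptions for illustration only.

def mlp(in_dim: int, out_dim: int, hidden: int = 256) -> nn.Sequential:
    """Two-hidden-layer MLP with ReLU activations, as described in the paper."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, hidden), nn.ReLU(),
        nn.Linear(hidden, out_dim),
    )

obs_dim, act_dim = 17, 6   # e.g. HalfCheetah; per-environment in practice
num_q = 10                 # ensemble size N is a per-task hyperparameter in the paper

# Ensemble of Q-functions Q(s, a) -> scalar; the actor here is only a shape
# sketch (the actual EDAC actor is a Gaussian policy as in SAC).
q_ensemble = nn.ModuleList(mlp(obs_dim + act_dim, 1) for _ in range(num_q))
actor = mlp(obs_dim, act_dim)

q_optimizer = torch.optim.Adam(q_ensemble.parameters(), lr=3e-4)
actor_optimizer = torch.optim.Adam(actor.parameters(), lr=1e-4)
batch_size = 256
```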