DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning
Authors: Quan Vuong, Aviral Kumar, Sergey Levine, Yevgen Chebotar
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we demonstrate the benefits of our approach, DASCO, on standard benchmark tasks. For offline datasets that consist of a combination of expert, sub-optimal and noisy data, our method outperforms distribution-constrained offline RL methods by a large margin. (Section 5, Experiments: "Our experiments aim at answering the following questions: ...") |
| Researcher Affiliation | Collaboration | Quan Vuong¹, Aviral Kumar¹·², Sergey Levine¹·², Yevgen Chebotar¹ (¹Google Research, ²UC Berkeley) |
| Pseudocode | Yes | Algorithm 1 DASCO algorithm summary |
| Open Source Code | No | The paper states 'We will include these results in the Appendix' in the checklist regarding code availability, which implies future provision rather than current release. No direct link or unambiguous statement of immediate release is found. |
| Open Datasets | Yes | We use the existing Ant Maze environments from the D4RL suite [11]: antmaze-medium and antmaze-large. ... [11] J. Fu, A. Kumar, O. Nachum, G. Tucker, and S. Levine. D4RL: Datasets for deep data-driven reinforcement learning. arXiv, 2020. URL https://arxiv.org/pdf/2004.07219. |
| Dataset Splits | No | The paper states in its checklist that training details including data splits 'will be included in the Appendix,' but these details are not present in the provided main text. |
| Hardware Specification | No | The paper states in its checklist, 'Please see Appendix D' for compute and resource information, but no specific hardware details (like GPU or CPU models) are provided in the main text. |
| Software Dependencies | No | The paper mentions software such as PyTorch and the Adam optimizer in its references, but it does not provide a specific list of software dependencies with version numbers for its experimental setup in the main text. |
| Experiment Setup | No | The paper states in its checklist that 'training details (e.g., data splits, hyperparameters, how they were chosen)' will be included in the Appendix, and refers to Appendix E for baseline hyperparameter tuning. However, no specific experimental setup details or hyperparameter values are provided in the main text. |
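
The Ant Maze datasets cited in the Open Datasets row come from the D4RL benchmark. Below is a minimal sketch of how such a dataset is typically loaded with the `d4rl` and `gym` packages; the exact environment id (`antmaze-medium-play-v0` here) is an assumption, since the paper's main text does not state which version suffix was used.

```python
# Minimal sketch: loading a D4RL Ant Maze dataset of the kind used in the paper's
# experiments. Assumes the `gym` and `d4rl` packages are installed; the environment
# id below is an assumed example, not one confirmed by the paper's main text.
import gym
import d4rl  # noqa: F401  (importing d4rl registers its environments with gym)

env = gym.make("antmaze-medium-play-v0")

# Returns a dict with keys such as 'observations', 'actions', 'rewards',
# 'next_observations', and 'terminals', ready for offline RL training.
dataset = d4rl.qlearning_dataset(env)

print(dataset["observations"].shape, dataset["actions"].shape)
```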