DASCO: Dual-Generator Adversarial Support Constrained Offline Reinforcement Learning

Authors: Quan Vuong, Aviral Kumar, Sergey Levine, Yevgen Chebotar

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimentally, we demonstrate the benefits of our approach, DASCO, on standard benchmark tasks. For offline datasets that consist of a combination of expert, sub-optimal and noisy data, our method outperforms distribution-constrained offline RL methods by a large margin." Also, from Section 5 (Experiments): "Our experiments aim at answering the following questions:"
Researcher Affiliation | Collaboration | Quan Vuong¹, Aviral Kumar¹,², Sergey Levine¹,², Yevgen Chebotar¹ (¹Google Research, ²UC Berkeley)
Pseudocode | Yes | "Algorithm 1: DASCO algorithm summary" (a hedged illustrative sketch follows this table)
Open Source Code | No | In the checklist item on code availability, the paper states "We will include these results in the Appendix," which implies future provision rather than a current release. No direct link or unambiguous statement of an immediate release is found.
Open Datasets | Yes | "We use the existing Ant Maze environments from the D4RL suite [11]: antmaze-medium and antmaze-large." Reference: [11] J. Fu, A. Kumar, O. Nachum, G. Tucker, and S. Levine. D4RL: Datasets for deep data-driven reinforcement learning. arXiv, 2020. URL https://arxiv.org/pdf/2004.07219. (A loading sketch follows this table.)
Dataset Splits | No | The checklist states that training details, including data splits, "will be included in the Appendix," but these details are not present in the provided main text.
Hardware Specification | No | The checklist says "Please see Appendix D" for compute and resource information, but no specific hardware details (such as GPU or CPU models) are given in the main text.
Software Dependencies | No | The paper mentions software such as PyTorch and the Adam optimizer in its references, but it does not list the software dependencies and version numbers used for its experimental setup in the main text.
Experiment Setup | No | The checklist states that "training details (e.g., data splits, hyperparameters, how they were chosen)" will be included in the Appendix and refers to Appendix E for baseline hyperparameter tuning, but no specific experimental setup details or hyperparameter values are provided in the main text.
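
The Pseudocode row above notes that the paper contains Algorithm 1, a DASCO algorithm summary, which is not reproduced here. Purely as an illustration of the idea in the title (a dual-generator adversarial support constraint on top of a Q-function), the following is a minimal PyTorch sketch. It is not the paper's Algorithm 1: the deterministic policy, network sizes, loss weights, the `q_weight` coefficient, and the omission of the Bellman critic update are all assumptions made for brevity.

```python
# Illustrative sketch only; NOT a reproduction of the paper's Algorithm 1.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, ACTION_DIM, HIDDEN = 17, 6, 256  # illustrative dimensions


def mlp(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, HIDDEN), nn.ReLU(),
        nn.Linear(HIDDEN, HIDDEN), nn.ReLU(),
        nn.Linear(HIDDEN, out_dim),
    )


critic = mlp(STATE_DIM + ACTION_DIM, 1)   # Q(s, a); its Bellman update is omitted here
policy = mlp(STATE_DIM, ACTION_DIM)       # primary generator: the policy pi(s)
aux_gen = mlp(STATE_DIM, ACTION_DIM)      # auxiliary generator that helps cover the data support
disc = mlp(STATE_DIM + ACTION_DIM, 1)     # discriminator D(s, a): dataset vs. generated actions

opt_d = torch.optim.Adam(disc.parameters(), lr=3e-4)
opt_g = torch.optim.Adam(list(policy.parameters()) + list(aux_gen.parameters()), lr=3e-4)


def adversarial_step(states, actions, q_weight=1.0):
    """One adversarial update on a batch of dataset (state, action) pairs."""
    # Discriminator: push dataset actions toward "real", generated actions toward "fake".
    with torch.no_grad():
        fake = torch.cat([torch.tanh(policy(states)), torch.tanh(aux_gen(states))])
    real_logits = disc(torch.cat([states, actions], dim=-1))
    fake_logits = disc(torch.cat([states.repeat(2, 1), fake], dim=-1))
    d_loss = (F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
              + F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits)))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generators: the policy maximizes Q while both generators try to fool the
    # discriminator, so staying on the data support does not force the policy
    # to imitate low-return dataset actions (the auxiliary generator can cover those).
    pi_a = torch.tanh(policy(states))
    aux_a = torch.tanh(aux_gen(states))
    pi_logits = disc(torch.cat([states, pi_a], dim=-1))
    aux_logits = disc(torch.cat([states, aux_a], dim=-1))
    support_loss = (F.binary_cross_entropy_with_logits(pi_logits, torch.ones_like(pi_logits))
                    + F.binary_cross_entropy_with_logits(aux_logits, torch.ones_like(aux_logits)))
    q_term = critic(torch.cat([states, pi_a], dim=-1)).mean()
    g_loss = support_loss - q_weight * q_term
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()


# Usage with a random batch standing in for a D4RL minibatch.
s = torch.randn(256, STATE_DIM)
a = torch.rand(256, ACTION_DIM) * 2 - 1
print(adversarial_step(s, a))
```

The critic's Bellman update and the exact way the two generators are mixed are left out; the sketch only shows why two generators are useful: the policy term is free to chase high Q-values because the auxiliary generator absorbs the burden of matching the rest of the data distribution.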
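The Open Datasets row cites the D4RL AntMaze environments. A minimal loading sketch, assuming the standard `d4rl` Python package with `gym` registration; the exact "-play-v2" version suffixes are assumptions and depend on the installed D4RL release:

```python
import gym
import d4rl  # importing d4rl registers its environments with gym

for name in ["antmaze-medium-play-v2", "antmaze-large-play-v2"]:
    env = gym.make(name)
    # qlearning_dataset() returns NumPy arrays keyed by "observations",
    # "actions", "rewards", "next_observations", and "terminals".
    data = d4rl.qlearning_dataset(env)
    print(name, data["observations"].shape, data["actions"].shape)
```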