Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Revisiting Data Augmentation in Deep Reinforcement Learning
Authors: Jianshu Hu, Yunpeng Jiang, Paul Weng
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 EXPERIMENTAL RESULTS. In order to validate our theoretical analysis and show the effectiveness of our proposed algorithm, we perform a series of experiments to (1) experimentally validate our propositions, (2) conduct a case study explicitly showing the statistics we analyzed, (3) compare our final proposed algorithm with state-of-the-art baselines (RAD, DrAC, DrQ, DrQv2, SVEA) to verify its sample efficiency, and evaluate its generalization ability against SVEA, which was specifically-designed for this purpose. |
| Researcher Affiliation | Academia | Jianshu Hu, Yunpeng Jiang: UM-SJTU Joint Institute, Shanghai Jiao Tong University, Shanghai, China. Paul Weng: Data Science Research Center, Duke Kunshan University, Kunshan, Jiangsu, China. |
| Pseudocode | Yes | Algorithm 1 Data-Augmented Off-policy Actor-Critic Scheme |
| Open Source Code | Yes | The source code of our method: https://github.com/Jianshu-Hu/drqv2 |
| Open Datasets | Yes | We evaluate different methods on environments from DeepMind Control Suite (Tassa et al., 2018) |
| Dataset Splits | No | The paper evaluates different methods on DeepMind Control Suite environments, but does not specify explicit training/validation/test dataset splits with percentages, counts, or references to predefined splits. |
| Hardware Specification | Yes | to make the algorithms easier to run on our computing device, equipped with one NVIDIA RTX 3060 GPU and Intel i7-10700 CPU |
| Software Dependencies | No | The paper mentions software like PyTorch and specifies optimizers (Adam) but does not provide specific version numbers for these or other key software dependencies. |
| Experiment Setup | Yes | Table 6: Hyperparameters used in experiments on DMControl (drq) |