Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Data-Efficiency with Contrastive Image Transformation in Reinforcement Learning
Authors: Sicong Liu, Xi Sheryl Zhang, Yushuo Li, Yifan Zhang, Jian Cheng
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on DeepMind Control Suite and Atari100K. Empirical results verify advances using CoIT, enabling it to outperform the new state-of-the-art on various tasks. Source code is available at https://github.com/mooricAnna/CoIT. In this section, we benchmark our method on the DeepMind control suite and Atari100K. We compare CoIT with prior model-free methods first, then we present ablation studies to show the details of our method. Implementation details can be found in Appendix C. |
| Researcher Affiliation | Academia | Sicong Liu (1, 2, 3), Xi Sheryl Zhang (2, 3, 5), Yushuo Li (2), Yifan Zhang (2, 3, 5), Jian Cheng (2, 3, 4). Affiliations: (1) NJUST; (2) Institute of Automation, Chinese Academy of Sciences; (3) AIRIA; (4) School of Future Technology, University of Chinese Academy of Sciences; (5) University of Chinese Academy of Sciences, Nanjing |
| Pseudocode | Yes | In Algorithm 1, it defines the MDP $M$ with Gaussian random variables $G_0, \dots, G_{|O|}$ for initialization. The entire algorithm is presented in Algorithm 2 in Appendix B. |
| Open Source Code | Yes | Source code is available at https://github.com/mooricAnna/CoIT. |
| Open Datasets | Yes | We evaluate our approach on DeepMind Control Suite and Atari100K. DMControl. DeepMind control suite (Tassa et al., 2018) is a widely used benchmark... Atari100K. There have been a number of prior papers that have benchmarked data-efficiency on the Atari 2600 Games... |
| Dataset Splits | No | The paper mentions training and evaluation on benchmarks but does not provide specific details on how the datasets were split into training, validation, and test sets (e.g., percentages or sample counts for each). |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU model, CPU, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch 1.x) that are needed to replicate the experiments. |
| Experiment Setup | No | The paper states 'Implementation details can be found in Appendix C.' but does not provide specific hyperparameter values or detailed training configurations in the main text. |