reproducibilityindex.ai

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On the Data-Efficiency with Contrastive Image Transformation in Reinforcement Learning

Authors: Sicong Liu, Xi Sheryl Zhang, Yushuo Li, Yifan Zhang, Jian Cheng

ICLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our approach on DeepMind Control Suite and Atari100K. Empirical results verify advances using CoIT, enabling it to outperform the new state-of-the-art on various tasks. Source code is available at https://github.com/mooricAnna/CoIT. In this section, we benchmark our method on the DeepMind control suite and Atari100K. We compare CoIT with prior model-free methods first, then we present ablation studies to show the details of our method. Implementation details can be found in Appendix C.
Researcher Affiliation	Academia	Sicong Liu (1, 2, 3), Xi Sheryl Zhang (2, 3, 5), Yushuo Li (2), Yifan Zhang (2, 3, 5), Jian Cheng (2, 3, 4). Affiliations: (1) NJUST; (2) Institute of Automation, Chinese Academy of Sciences; (3) AIRIA; (4) School of Future Technology, University of Chinese Academy of Sciences; (5) University of Chinese Academy of Sciences, Nanjing
Pseudocode	Yes	In Algorithm 1, it defines the MDP M with Gaussian random variables G0 G|O| for initialization. The entire algorithm is presented in Algorithm 2 in Appendix B.
Open Source Code	Yes	Source code is available at https://github.com/mooricAnna/CoIT.
Open Datasets	Yes	We evaluate our approach on DeepMind Control Suite and Atari100K. DMControl. DeepMind control suite (Tassa et al., 2018) is a widely used benchmark... Atari100K. There have been a number of prior papers that have benchmarked data-efficiency on the Atari 2600 Games...
Dataset Splits	No	The paper mentions training and evaluation on benchmarks but does not provide specific details on how the datasets were split into training, validation, and test sets (e.g., percentages or sample counts for each).
Hardware Specification	No	The paper does not specify the exact hardware (e.g., GPU model, CPU, memory) used for running the experiments.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers (e.g., Python version, library versions like PyTorch 1.x) that are needed to replicate the experiments.
Experiment Setup	No	The paper states 'Implementation details can be found in Appendix C.' but does not provide specific hyperparameter values or detailed training configurations in the main text.