On the Data-Efficiency with Contrastive Image Transformation in Reinforcement Learning

Authors: Sicong Liu, Xi Sheryl Zhang, Yushuo Li, Yifan Zhang, Jian Cheng

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our approach on DeepMind Control Suite and Atari100K. Empirical results verify advances using CoIT, enabling it to outperform the new state-of-the-art on various tasks. Source code is available at https://github.com/mooricAnna/CoIT. In this section, we benchmark our method on the DeepMind control suite and Atari100K. We compare CoIT with prior model-free methods first, then we present ablation studies to show the details of our method. Implementation details can be found in Appendix C.
Researcher Affiliation | Academia | Sicong Liu (1, 2, 3), Xi Sheryl Zhang (2, 3, 5), Yushuo Li (2), Yifan Zhang (2, 3, 5), Jian Cheng (2, 3, 4). Affiliations: (1) NJUST; (2) Institute of Automation, Chinese Academy of Sciences; (3) AIRIA; (4) School of Future Technology, University of Chinese Academy of Sciences; (5) University of Chinese Academy of Sciences, Nanjing
Pseudocode | Yes | In Algorithm 1, it defines the MDP M with Gaussian random variables G0 G|O| for initialization. The entire algorithm is presented in Algorithm 2 in Appendix B.
Open Source Code | Yes | Source code is available at https://github.com/mooricAnna/CoIT.
Open Datasets | Yes | We evaluate our approach on DeepMind Control Suite and Atari100K. DMControl. DeepMind control suite (Tassa et al., 2018) is a widely used benchmark... Atari100K. There have been a number of prior papers that have benchmarked data-efficiency on the Atari 2600 Games...
Dataset Splits | No | The paper mentions training and evaluation on benchmarks but does not specify how the data were split into training, validation, and test sets (e.g., percentages or sample counts for each).
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU model, CPU, memory) used to run the experiments.
Software Dependencies | No | The paper does not list the software dependencies, with version numbers, needed to replicate the experiments (e.g., Python version, library versions such as PyTorch 1.x).
Experiment Setup | No | The paper states 'Implementation details can be found in Appendix C.' but does not provide specific hyperparameter values or detailed training configurations in the main text.