CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning

Authors: Chenyu Sun, Hangwei Qian, Chunyan Miao

IJCAI 2022

Reproducibility Variable — Result — LLM Response

Research Type — Experimental
"We apply CCLF to several base RL algorithms and evaluate on the DeepMind Control Suite, Atari, and MiniGrid benchmarks, where our approach demonstrates superior sample efficiency and learning performance compared with other state-of-the-art methods."

Researcher Affiliation — Collaboration
Chenyu Sun (1,2,3), Hangwei Qian (2,4), and Chunyan Miao (1,2,4)
1 Alibaba-NTU Singapore Joint Research Institute
2 School of Computer Science and Engineering, Nanyang Technological University
3 Alibaba Group
4 Joint NTU-UBC Research Centre of Excellence in Active Living for the Elderly (LILY)
{chenyu002, qian0045}@e.ntu.edu.sg, ascymiao@ntu.edu.sg

Pseudocode — Yes
"Algorithm 1: An Implementation of CCLF on SAC"

Open Source Code — Yes
"Our code is available at https://github.com/csun001/CCLF."

Open Datasets — Yes
"We empirically evaluate the proposed CCLF in terms of sample efficiency and ultimate performance, on 6 continuous control tasks from the DMC suite [Tunyasuvunakool et al., 2020], 26 discrete control tasks from the Atari games [Bellemare et al., 2013] and 3 navigation tasks with sparse extrinsic rewards from MiniGrid [Chevalier-Boisvert et al., 2018]."

Dataset Splits — No
The paper discusses training via environment interactions, experience replay, and minibatch sampling, but it does not specify explicit training, validation, or test splits (as percentages or counts) needed to reproduce the data partitioning.

Hardware Specification — No
The paper does not provide specific hardware details (e.g., CPU or GPU models, memory) used to run the experiments.

Software Dependencies — No
The paper mentions various algorithms and frameworks (e.g., SAC, CURL, DrQ, Rainbow DQN, A2C, RE3) but does not list software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or other library versions).

Experiment Setup — Yes
"The detailed setting of hyper-parameters is provided in Appendix B.1. For our proposed CCLF, we initialize it with [K, M] = [5, 5] to generate a sufficiently large amount of augmented inputs. For simplicity, we fix i randomly and only select j via Eq. (5) for the augmented input selection."
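The augmented-input selection quoted above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the crop-based augmentation, the `curiosity_score` callable, and all function names here are assumptions (the paper's actual scoring rule is its Eq. (5), which is not reproduced in this report).

```python
import numpy as np

def random_crop(obs, crop_size):
    """Randomly crop a (C, H, W) observation to (C, crop_size, crop_size)."""
    _, h, w = obs.shape
    top = np.random.randint(0, h - crop_size + 1)
    left = np.random.randint(0, w - crop_size + 1)
    return obs[:, top:top + crop_size, left:left + crop_size]

def select_augmented_pair(obs, curiosity_score, K=5, M=5, crop_size=64):
    """Generate K query crops and M key crops of the same observation,
    fix the query index i at random, and pick the key index j that
    maximizes a (hypothetical) curiosity score -- mirroring the paper's
    simplification of fixing i and selecting only j."""
    queries = [random_crop(obs, crop_size) for _ in range(K)]
    keys = [random_crop(obs, crop_size) for _ in range(M)]
    i = np.random.randint(K)                              # i fixed at random
    scores = [curiosity_score(queries[i], k) for k in keys]
    j = int(np.argmax(scores))                            # j chosen greedily
    return queries[i], keys[j]
```

For example, passing a simple L2-distance stand-in for the curiosity score returns the most dissimilar key crop; the real method would instead score candidates with its learned contrastive-curiosity signal.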