CURL: Contrastive Unsupervised Representations for Reinforcement Learning
Authors: Michael Laskin, Aravind Srinivas, Pieter Abbeel
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | CURL outperforms prior pixel-based methods, both model-based and model-free, on complex tasks in the DeepMind Control Suite and Atari Games showing 1.9x and 1.2x performance gains at the 100K environment and interaction steps benchmarks respectively. |
| Researcher Affiliation | Academia | Michael Laskin¹ Aravind Srinivas¹ Pieter Abbeel¹ ¹University of California, Berkeley, BAIR. Correspondence to: Michael Laskin, Aravind Srinivas, <mlaskin, aravind srinivas@berkeley.edu>. |
| Pseudocode | Yes | 4.7. CURL Contrastive Learning Pseudocode (PyTorch-like) |
| Open Source Code | Yes | Our code is open-sourced and available at https://www.github.com/MishaLaskin/curl. |
| Open Datasets | Yes | We benchmark for sample-efficiency on the DMControl suite (Tassa et al., 2018) and Atari Games benchmarks (Bellemare et al., 2013). |
| Dataset Splits | No | The paper describes training and evaluation within reinforcement learning environments (DMControl, Atari Games) where data is collected dynamically via agent interaction and stored in a replay buffer. It does not provide explicit training/validation/test dataset splits with percentages or counts as would be found in supervised learning. |
| Hardware Specification | No | The paper mentions receiving 'Google TFRC for cloud credits' in the acknowledgements, but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions 'PyTorch-like' in the pseudocode section but does not provide specific version numbers for PyTorch or any other software dependencies, such as libraries or environments, used in the experiments. |
| Experiment Setup | Yes | The paper provides specific experimental setup details such as the momentum parameter for target encoding ('m: momentum, e.g. 0.95'), and details on data augmentation ('Our aspect ratio for cropping is 0.84, i.e, we crop a 84 × 84 image from a 100 × 100 simulation-rendered image.'). |
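
The two setup details quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `random_crop` and `momentum_update` are hypothetical, images are plain nested lists rather than tensors, and the momentum rule is assumed to be the usual exponential moving average, with only the example value m = 0.95 and the 100 × 100 to 84 × 84 crop sizes taken from the quotes above.

```python
import random


def random_crop(image, out_size=84, rng=random):
    """Return a random out_size x out_size window from a 2D image (list of rows).

    Hypothetical helper: with a 100 x 100 input and out_size=84, the crop
    ratio is 84 / 100 = 0.84, the value quoted in the table.
    """
    height, width = len(image), len(image[0])
    if out_size > height or out_size > width:
        raise ValueError("crop size exceeds image size")
    # randint is inclusive on both ends, so every valid offset is reachable.
    top = rng.randint(0, height - out_size)
    left = rng.randint(0, width - out_size)
    return [row[left:left + out_size] for row in image[top:top + out_size]]


def momentum_update(key_params, query_params, m=0.95):
    """Assumed EMA rule for the target (key) encoder: k <- m * k + (1 - m) * q."""
    return [m * k + (1.0 - m) * q for k, q in zip(key_params, query_params)]


# Stand-in for a 100 x 100 simulation-rendered frame (single channel).
frame = [[r * 100 + c for c in range(100)] for r in range(100)]
crop = random_crop(frame)

# Toy parameter vectors; real encoders would hold tensors, not flat lists.
updated = momentum_update([1.0, 0.0], [0.0, 1.0])
```

Under these assumptions, a larger m makes the target encoder change more slowly relative to the query encoder.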