Reinforcement Learning with Prototypical Representations
Authors: Denis Yarats, Rob Fergus, Alessandro Lazaric, Lerrel Pinto
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we discuss empirical results on using Proto-RL for learning visual representations. (Section 5) and Proto-RL significantly improves upon Random exploration and APT across all environments, while being better than Curiosity-based exploration in 7/8 environments. (Section 5.2) |
| Researcher Affiliation | Collaboration | 1New York University 2Facebook AI Research. |
| Pseudocode | Yes | The pseudo-code for our framework is provided in Appendix C. (Section 4.1) and Algorithm 1: Proto-RL (Appendix C). |
| Open Source Code | Yes | We open-source our code at https://github.com/denisyarats/proto. |
| Open Datasets | Yes | We use the Deep Mind Control Suite (Tassa et al., 2018), a challenging benchmark for image-based RL. |
| Dataset Splits | No | The paper describes environment interactions and evaluation episodes, but it does not specify explicit training/validation/test dataset splits like percentages or sample counts for a static dataset. In RL, data is dynamically generated. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models (e.g., NVIDIA V100), CPU models, or cloud computing instance types used for the experiments. |
| Software Dependencies | Yes | Proto-RL is trained using Adam (Kingma & Ba, 2014)... We use SAC implementation from Yarats & Kostrikov (2020). |
| Experiment Setup | Yes | Hyper-parameters: Proto-RL is trained using Adam (Kingma & Ba, 2014) with learning rate 10^-4 and mini-batch size of 512. The downstream exploration hyper-parameter is set to 0.2 and the number of cluster candidates is set to T = 4. We use the SAC implementation from Yarats & Kostrikov (2020). |
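
The hyper-parameters reported in the Experiment Setup row can be gathered into a single config sketch for anyone attempting a re-implementation. The field names below are our own labels (the symbol for the downstream exploration coefficient was lost in extraction); only the values come from the paper:

```python
# Hyper-parameters reported for Proto-RL (field names are illustrative,
# not the paper's own identifiers; values are from the table above).
PROTO_RL_HPARAMS = {
    "optimizer": "Adam",                  # Kingma & Ba, 2014
    "learning_rate": 1e-4,
    "mini_batch_size": 512,
    "downstream_exploration_coef": 0.2,   # symbol not recoverable from the extraction
    "num_cluster_candidates": 4,          # T = 4
    "sac_implementation": "Yarats & Kostrikov (2020)",
}

if __name__ == "__main__":
    for name, value in PROTO_RL_HPARAMS.items():
        print(f"{name}: {value}")
```

This is only a convenience summary of the values quoted above, not a complete training configuration; settings such as replay buffer size or network architecture are not covered in this row.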