Critic Regularized Regression
Authors: Ziyu Wang, Alexander Novikov, Konrad Zolna, Josh S. Merel, Jost Tobias Springenberg, Scott E. Reed, Bobak Shahriari, Noah Siegel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our algorithm, CRR, on a number of challenging simulated manipulation and locomotion domains. Our results demonstrate that CRR works well even in these challenging settings and that it outperforms previously published approaches, in some cases by a considerable margin. |
| Researcher Affiliation | Industry | Ziyu Wang (ziyu@google.com), Alexander Novikov (anovikov@google.com), Konrad Zołna (kondiz@google.com), Jost Tobias Springenberg (springenberg@google.com), Scott Reed (reedscot@google.com), Bobak Shahriari (bshahr@google.com), Noah Siegel (siegeln@google.com), Josh Merel (jsmerel@google.com), Caglar Gulcehre (caglarg@google.com), Nicolas Heess (heess@google.com), Nando de Freitas (nando@google.com). DeepMind, London, United Kingdom. Google Brain, Toronto, Canada. |
| Pseudocode | Yes | Algorithm 1: Critic Regularized Regression |
| Open Source Code | No | The paper does not explicitly state that the code for CRR is open-source or provide a link to a repository for the methodology described. |
| Open Datasets | Yes | We experiment with the continuous control tasks introduced in RL Unplugged (RLU) [3]. There are 17 different tasks in RLU: nine tasks from the DeepMind Control Suite [34] and seven locomotion tasks. We additionally introduce four robotic manipulation datasets. |
| Dataset Splits | No | The paper discusses training and evaluating models but does not provide specific dataset split percentages (e.g., train/validation/test) or sample counts for its experiments. |
| Hardware Specification | No | The paper states 'All simulations are conducted using MuJoCo [35]' but does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) to reproduce the experiment setup. |
| Experiment Setup | Yes | Full details on the hyper-parameters are given in the appendix. |