Critic Regularized Regression

Authors: Ziyu Wang, Alexander Novikov, Konrad Zolna, Josh S. Merel, Jost Tobias Springenberg, Scott E. Reed, Bobak Shahriari, Noah Siegel, Caglar Gulcehre, Nicolas Heess, Nando de Freitas

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our algorithm, CRR, on a number of challenging simulated manipulation and locomotion domains. Our results demonstrate that CRR works well even in these challenging settings and that it outperforms previously published approaches, in some cases by a considerable margin.
Researcher Affiliation | Industry | Ziyu Wang (ziyu@google.com), Alexander Novikov (anovikov@google.com), Konrad Zołna (kondiz@google.com), Jost Tobias Springenberg (springenberg@google.com), Scott Reed (reedscot@google.com), Bobak Shahriari (bshahr@google.com), Noah Siegel (siegeln@google.com), Josh Merel (jsmerel@google.com), Caglar Gulcehre (caglarg@google.com), Nicolas Heess (heess@google.com), Nando de Freitas (nando@google.com). DeepMind, London, United Kingdom. Google Brain, Toronto, Canada.
Pseudocode | Yes | Algorithm 1: Critic Regularized Regression
Open Source Code | No | The paper does not explicitly state that the code for CRR is open-source or provide a link to a repository for the methodology described.
Open Datasets | Yes | We experiment with the continuous control tasks introduced in RL Unplugged (RLU) [3]. There are 17 different tasks in RLU: nine tasks from the Deepmind Control suite [34] and seven locomotion tasks. We additionally introduce four robotic manipulation datasets.
Dataset Splits | No | The paper discusses training and evaluating models but does not provide specific dataset split percentages (e.g., train/validation/test) or sample counts for its experiments.
Hardware Specification | No | The paper states 'All simulations are conducted using MuJoCo [35]' but does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.8, PyTorch 1.9, CUDA 11.1) to reproduce the experiment setup.
Experiment Setup | Yes | Full details on the hyper-parameters are given in the appendix.