Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Continuous Control with Action Quantization from Demonstrations

Authors: Robert Dadashi, Léonard Hussenot, Damien Vincent, Sertan Girgin, Anton Raichuk, Matthieu Geist, Olivier Pietquin

ICML 2022

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | We empirically evaluate this discretization strategy on three downstream task setups: Reinforcement Learning with demonstrations, Reinforcement Learning with play data (demonstrations of a human playing in an environment but not solving any specific task), and Imitation Learning.
Researcher Affiliation | Collaboration | Google Research, Brain Team; Univ. de Lille, CNRS, Inria Scool, UMR 9189 CRIStAL.
Pseudocode | No | The paper does not contain any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | "... and make the code available: https://github.com/google-research/google-research/tree/master/aquadem"
Open Datasets | Yes | "We consider the Adroit tasks (Rajeswaran et al., 2017) represented in Figure 11, for which human demonstrations are available (25 episodes acquired using a virtual reality system). ... We consider the Robodesk tasks (Kannan et al., 2021) shown in Figure 11, for which we acquired play data. ... We evaluate the resulting algorithm on the D4RL locomotion tasks and provide performance against state-of-the-art offline RL algorithms."
Dataset Splits | No | The paper describes training on environment interactions and evaluating on episodes, but it does not specify explicit dataset splits (e.g., percentages or counts) specifically designated for 'validation' purposes from a fixed dataset.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions that the 'implementation is from Acme (Hoffman et al., 2020)' but does not provide specific version numbers for Acme or other key software dependencies such as PyTorch or TensorFlow.
Experiment Setup | Yes | "For all experiments, we detail the networks architectures, hyperparameters search, and training procedures in the Appendix and we provide videos of all the agents trained in the website. ... Table 3. Hyperparameter sweep for the AQuaDQN agent."