Solving Continuous Control via Q-learning

Authors: Tim Seyde, Peter Werner, Wilko Schwarting, Igor Gilitschenski, Martin Riedmiller, Daniela Rus, Markus Wulfmeier

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that a simple modification of deep Q-learning largely alleviates these issues. By combining bang-bang action discretization with value decomposition, framing single-agent control as cooperative multi-agent reinforcement learning (MARL), this simple critic-only approach matches performance of state-of-the-art continuous actor-critic methods when learning from features or pixels. [...] We evaluate performance of the DecQN agent on several continuous control environments from the DeepMind Control Suite (Tunyasuvunakool et al., 2020) and Meta-World (Yu et al., 2020). (A minimal sketch of this decomposition follows the table.) |
| Researcher Affiliation | Collaboration | Tim Seyde (MIT CSAIL), Peter Werner (MIT CSAIL), Wilko Schwarting (MIT CSAIL), Igor Gilitschenski (University of Toronto), Martin Riedmiller (DeepMind), Daniela Rus (MIT CSAIL), Markus Wulfmeier (DeepMind) |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states that the Acme framework and the baseline agents (Dreamer-v2, DrQ-v2) are open source, but it does not provide a direct link to, or an explicit statement about, an open-source release of its own DecQN implementation. The project website linked in a footnote *does* provide the code, but the paper itself does not. |
| Open Datasets | Yes | We evaluate performance of the DecQN agent on several continuous control environments from the DeepMind Control Suite (Tunyasuvunakool et al., 2020) and Meta-World (Yu et al., 2020) |
| Dataset Splits | No | The paper describes using multiple seeds and varying hyperparameters, but does not specify explicit train/validation/test splits with percentages or sample counts for the continuous control tasks. In reinforcement learning the environment typically serves as both the training and evaluation ground, so traditional dataset-split information is not provided. |
| Hardware Specification | Yes | Experiments on Control Suite and Matrix Game tasks were conducted on a single NVIDIA V100 GPU with 4 CPU cores (state-based) or 20 CPU cores (pixel-based). Experiments in Meta-World and Isaac Gym were conducted on a single NVIDIA 2080Ti with 4 CPU cores. |
| Software Dependencies | No | The paper mentions implementing DecQN within the Acme framework in TensorFlow, with a PyTorch reimplementation for the Mini Cheetah task, but does not provide specific version numbers for TensorFlow, PyTorch, or Acme. |
| Experiment Setup | Yes | We provide hyperparameter values of DecQN used for benchmarking in Table 2. A constant set of hyperparameters is used throughout all experiments, with modifications to the network architecture for vision-based tasks. [...] Table 2: DecQN hyperparameters for state- and pixel-based control. (Includes learning rate, batch size, discount γ, etc.) |
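
The mechanism quoted under Research Type is compact enough to illustrate. Below is a minimal PyTorch-style sketch of per-dimension value decomposition with bang-bang action discretization; it is not the authors' released implementation, and the class name `DecomposedQNetwork`, the MLP sizes, and the mean aggregation over action dimensions are illustrative assumptions.

```python
# Minimal sketch (assumed architecture, not the paper's code) of a decomposed Q-network:
# each action dimension gets its own small set of discrete (bang-bang) choices and its
# own utility head; the joint Q-value is aggregated from the per-dimension utilities.
import torch
import torch.nn as nn


class DecomposedQNetwork(nn.Module):
    def __init__(self, obs_dim: int, num_action_dims: int, bins_per_dim: int = 2, hidden: int = 256):
        super().__init__()
        self.bins_per_dim = bins_per_dim  # 2 bins = bang-bang {-1, +1}
        self.torso = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One utility head per action dimension, each scoring that dimension's bins.
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, bins_per_dim) for _ in range(num_action_dims)]
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        """Per-dimension utilities of shape [batch, num_action_dims, bins_per_dim]."""
        z = self.torso(obs)
        return torch.stack([head(z) for head in self.heads], dim=1)

    def joint_q(self, obs: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        """Joint Q-value as the mean of the utilities of the selected bins (linear decomposition)."""
        utilities = self.forward(obs)                          # [B, D, bins]
        chosen = utilities.gather(-1, actions.unsqueeze(-1))   # [B, D, 1]
        return chosen.squeeze(-1).mean(dim=-1)                 # [B]

    def greedy_actions(self, obs: torch.Tensor) -> torch.Tensor:
        """Per-dimension argmax; decoding each dimension independently avoids enumerating
        the exponentially many joint discrete actions."""
        return self.forward(obs).argmax(dim=-1)                # [B, D]
```

Bin indices map back to continuous controls via, e.g., `u = 2.0 * actions.float() / (bins_per_dim - 1) - 1.0`, which for two bins yields the extreme bang-bang actions {-1, +1}.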