DAC: The Double Actor-Critic Architecture for Learning Options
Authors: Shangtong Zhang, Shimon Whiteson
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct an empirical study on challenging robot simulation tasks. |
| Researcher Affiliation | Academia | Shangtong Zhang, Shimon Whiteson Department of Computer Science University of Oxford {shangtong.zhang, shimon.whiteson}@cs.ox.ac.uk |
| Pseudocode | Yes | The pseudocode of DAC is provided in the supplementary materials. |
| Open Source Code | Yes | All implementations are made publicly available. https://github.com/ShangtongZhang/DeepRL |
| Open Datasets | Yes | We consider four robot simulation tasks used by Smith et al. (2018) from OpenAI gym (Brockman et al., 2016). ... We use 6 pairs of tasks from DeepMind Control Suite (DMControl, Tassa et al. 2018) |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits. It uses reinforcement learning environments (OpenAI Gym, DeepMind Control Suite) where data is generated through interaction. |
| Hardware Specification | No | The Acknowledgments section mentions 'a generous equipment grant from NVIDIA' but does not specify the exact GPU models, CPU models, or other hardware specifications used for experiments. |
| Software Dependencies | No | The paper mentions using PPO, A2C, OpenAI Gym, and DeepMind Control Suite, but does not provide specific version numbers for these software dependencies or any other libraries. |
| Experiment Setup | Yes | Our PPO implementation uses the same architecture and hyperparameters reported by Schulman et al. (2017). ... We use 4 options for all algorithms, following Smith et al. (2018). We report the online training episode return, smoothed by a sliding window of size 20. |
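
The Experiment Setup row states that episode returns are smoothed by a sliding window of size 20. A minimal sketch of one way to do that follows. It assumes a trailing window that averages over however many episodes are available during the first 19 episodes; the paper does not say whether the window is trailing or centered, so this is an illustration and not the authors' code. The function name `sliding_window_mean` is hypothetical.

```python
from collections import deque


def sliding_window_mean(returns, window=20):
    """Trailing mean of episode returns over the last `window` episodes.

    Assumption: for the first `window - 1` episodes the mean is taken over
    the episodes seen so far. The source does not specify this detail.
    """
    buf = deque(maxlen=window)  # holds at most the last `window` returns
    total = 0.0                 # running sum of the values in `buf`
    smoothed = []
    for r in returns:
        if len(buf) == window:
            # The oldest value is about to be evicted by the append below.
            total -= buf[0]
        buf.append(r)
        total += r
        smoothed.append(total / len(buf))
    return smoothed


if __name__ == "__main__":
    # Toy returns, made up for illustration only.
    episode_returns = [10.0, 12.0, 8.0, 15.0, 11.0]
    print(sliding_window_mean(episode_returns, window=20))
```

Passing `window=20` matches the window size quoted in the table; a smaller window is only convenient for checking the function by hand.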