IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks
Authors: Michael Luo, Jiahao Yao, Richard Liaw, Eric Liang, Ion Stoica
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In discrete action-space environments, we show that IMPACT attains higher reward and, simultaneously, achieves up to a 30% decrease in training wall-time compared to IMPALA. For continuous control environments, IMPACT trains faster than existing scalable agents while preserving the sample efficiency of synchronous PPO. |
| Researcher Affiliation | Academia | Michael Luo UC Berkeley michael.luo@berkeley.edu Jiahao Yao UC Berkeley jiahaoyao@berkeley.edu Richard Liaw UC Berkeley Eric Liang UC Berkeley Ion Stoica UC Berkeley |
| Pseudocode | Yes | Algorithm 1 IMPACT |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We tested the agent on three continuous environments (Figure 5): Half Cheetah, Hopper, and Humanoid on 16 CPUs and 1 GPU. For the discrete environments (Figure 6), Pong, Space Invaders, and Breakout were chosen as common benchmarks used in popular distributed RL libraries (Caspi et al., 2017; Liang et al., 2018). Additional experiments for discrete environments are in the Appendix. These experiments were run on 32 CPUs and 1 GPU. |
| Dataset Splits | No | The paper mentions 'evaluation rollouts' but does not explicitly provide specific details on how the dataset was split into training, validation, and test portions (e.g., percentages, sample counts, or citations to predefined splits). |
| Hardware Specification | No | 'We tested the agent on three continuous environments... on 16 CPUs and 1 GPU. For the discrete environments... ran on 32 CPUs and 1 GPU.' The paper specifies only the number of CPUs and GPUs, not their models (e.g., 'NVIDIA A100', 'Intel Xeon'), which a detailed hardware specification requires. |
| Software Dependencies | No | The paper mentions environments like 'OpenAI Gym', 'MuJoCo', and 'Atari environments', and implicitly uses deep learning frameworks. However, it does not specify any software components with their version numbers (e.g., 'Python 3.x', 'PyTorch 1.x'). |
| Experiment Setup | Yes | The hyper-parameters for discrete and continuous environments are listed in Appendix B, Tables 1 and 2 respectively. Table 1: Hyperparameters for Discrete Environments. Table 2: Hyperparameters for Continuous Control Environments. |