IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks

Authors: Michael Luo, Jiahao Yao, Richard Liaw, Eric Liang, Ion Stoica

ICLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In discrete action-space environments, we show that IMPACT attains higher reward and, simultaneously, achieves up to 30% decrease in training wall-time than that of IMPALA. For continuous control environments, IMPACT trains faster than existing scalable agents while preserving the sample efficiency of synchronous PPO."
Researcher Affiliation | Academia | Michael Luo (UC Berkeley, michael.luo@berkeley.edu); Jiahao Yao (UC Berkeley, jiahaoyao@berkeley.edu); Richard Liaw (UC Berkeley); Eric Liang (UC Berkeley); Ion Stoica (UC Berkeley)
Pseudocode | Yes | Algorithm 1: IMPACT
Open Source Code | No | The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "We tested the agent on three continuous environments (Figure 5): Half Cheetah, Hopper, and Humanoid on 16 CPUs and 1 GPU. For the discrete environments (Figure 6), Pong, Space Invaders, and Breakout were chosen as common benchmarks used in popular distributed RL libraries (Caspi et al., 2017; Liang et al., 2018). Additional experiments for discrete environments are in the Appendix. These experiments were ran on 32 CPUs and 1 GPU."
Dataset Splits | No | The paper mentions "evaluation rollouts" but does not explicitly describe how the data was split into training, validation, and test portions (e.g., percentages, sample counts, or citations to predefined splits).
Hardware Specification | No | The paper states only the counts of CPUs and GPUs ("on 16 CPUs and 1 GPU" for the continuous environments, "32 CPUs and 1 GPU" for the discrete ones), not specific models (e.g., NVIDIA A100, Intel Xeon), which a detailed hardware specification requires.
Software Dependencies | No | The paper mentions environments such as OpenAI Gym, MuJoCo, and the Atari environments, and implicitly uses deep learning frameworks, but it does not name any software component with a version number (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | "The hyper-parameters for continuous and discrete environments are listed in the Appendix B table 1 and 2 respectively." Table 1: Hyperparameters for Discrete Environments; Table 2: Hyperparameters for Continuous Control Environments.
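The "Pseudocode" row above refers to Algorithm 1 (IMPACT), and the title names the paper's core mechanism: clipped target networks. As a rough illustration only, here is a minimal Python sketch of a PPO-style clipped surrogate in which the importance ratio is taken against a slowly updated target policy. The function names, the Polyak-style update, and the `eps`/`tau` defaults are assumptions for illustration, not the paper's implementation (whose code, per the assessment, is not released).

```python
# Illustrative sketch (NOT the paper's code): a PPO-style clipped
# surrogate whose importance ratio is computed against a slowly
# updated target policy, the idea behind "clipped target networks".
import math


def clipped_target_surrogate(logp_current, logp_target, advantage, eps=0.2):
    """Clipped surrogate objective for one (state, action) sample.

    logp_current: log pi_theta(a|s) under the learner's current policy
    logp_target:  log pi_target(a|s) under the slowly updated target network
    advantage:    advantage estimate for the sample
    eps:          clipping radius (PPO-style; 0.2 is an assumed default)
    """
    ratio = math.exp(logp_current - logp_target)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    # Pessimistic (minimum) objective, as in PPO's clipped surrogate.
    return min(ratio * advantage, clipped * advantage)


def soft_update(target_params, current_params, tau=0.01):
    """Polyak averaging of target-network parameters. This update rule
    is an assumption; a target network could equally be refreshed by a
    periodic hard copy of the learner's weights."""
    return [(1.0 - tau) * t + tau * c
            for t, c in zip(target_params, current_params)]
```

When the current and target policies agree, the ratio is 1 and the surrogate reduces to the raw advantage; when they diverge, clipping bounds how far a single update can push the policy away from the target network.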