RLx2: Training a Sparse Deep Reinforcement Learning Model from Scratch
Authors: Yiqin Tan, Pihe Hu, Ling Pan, Jiatai Huang, Longbo Huang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, we demonstrate the state-of-the-art sparse training performance of RLx2 with two popular DRL algorithms, TD3 (Fujimoto et al., 2018) and SAC (Haarnoja et al., 2018), on several MuJoCo (Todorov et al., 2012) continuous control tasks. |
| Researcher Affiliation | Academia | Yiqin Tan, Pihe Hu, Ling Pan, Jiatai Huang, Longbo Huang, Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing, China. {tyq22, hph19}@mails.tsinghua.edu.cn, longbohuang@tsinghua.edu.cn |
| Pseudocode | Yes | The pseudo-code of our scheme is given in Algorithm 1, where ⊙ is the element-wise multiplication operator and Mθ is the binary mask to represent the sparse topology of the network θ. (...) Algorithm 1 Topology Evolution (Evci et al., 2020) |
| Open Source Code | Yes | The code is available at https://github.com/tyq1024/RLx2. |
| Open Datasets | Yes | Our experiments are conducted in four popular MuJoCo environments: HalfCheetah-v3 (Hal.), Hopper-v3 (Hop.), Walker2d-v3 (Wal.), and Ant-v3 (Ant.), for RLx2 with two off-policy algorithms, TD3 and SAC. |
| Dataset Splits | No | The paper describes training steps and evaluations based on average reward per episode, but it does not specify explicit training/validation/test dataset splits (e.g., percentages or counts) as commonly seen in supervised learning. |
| Hardware Specification | Yes | Our experiments are implemented with PyTorch (Paszke et al., 2017) and run on 8x P100 GPUs. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2017)' but does not provide a specific version number for PyTorch or other key software dependencies. |
| Experiment Setup | Yes | Table 4 presents detailed hyperparameters of RLx2-TD3 and RLx2-SAC in our experiments. |
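The Pseudocode row quotes the paper's core mechanism: a sparse network is represented as a dense weight tensor θ masked by a binary mask Mθ via element-wise multiplication. A minimal NumPy sketch of that masking idea is below; the `random_mask` helper and the 75% sparsity level are illustrative assumptions, not the paper's actual topology-evolution rule (which grows/prunes connections during training, per Evci et al., 2020).

```python
import numpy as np

def random_mask(shape, sparsity, rng):
    """Hypothetical helper: binary mask zeroing a `sparsity` fraction of entries."""
    mask = np.ones(int(np.prod(shape)))
    n_zero = int(sparsity * mask.size)
    mask[rng.choice(mask.size, size=n_zero, replace=False)] = 0.0
    return mask.reshape(shape)

rng = np.random.default_rng(0)
theta = rng.standard_normal((4, 4))        # dense weight matrix θ
M = random_mask(theta.shape, 0.75, rng)    # binary mask Mθ (keep 25% of weights)
sparse_theta = theta * M                   # element-wise multiplication θ ⊙ Mθ
```

In the paper's scheme the mask itself evolves over training (connections are periodically dropped and regrown), while the sketch above only shows the static masking step that defines the sparse topology.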