SAPG: Split and Aggregate Policy Gradients

Authors: Jayesh Singla, Ananye Agarwal, Deepak Pathak

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5. Experimental Setup; 6. Results and Analysis; Figure 5. Performance curves of SAPG with respect to PPO, PBT and PQL baselines.
Researcher Affiliation | Academia | Carnegie Mellon University.
Pseudocode | Yes | Algorithm 1 SAPG (an illustrative sketch follows this table).
Open Source Code | Yes | Webpage at https://sapg-rl.github.io.
Open Datasets | Yes | We conduct experiments on 5 manipulation tasks (3 hard and 2 easy) and compare them against SOTA methods for the large-scale parallelized setting. We use a GPU-accelerated simulator, Isaac Gym (Makoviychuk et al., 2021); For testing, we choose a suite of manipulation environments that are challenging and require large-scale data to learn effective policies (Petrenko et al., 2023).
Dataset Splits | No | The paper does not explicitly provide training, validation, and test dataset splits with percentages or sample counts. In reinforcement learning, data is typically generated through environment interaction; the authors specify test environments but do not mention a validation split for the collected experience.
Hardware Specification | No | The paper mentions running experiments on a “single GPU” and using “GPU-accelerated simulators” but does not provide specific hardware models (e.g., GPU model, CPU model, or memory size).
Software Dependencies | No | The paper mentions software components and frameworks such as PPO, Isaac Gym, PhysX, MuJoCo 3.0, and the ELU activation, but it does not provide specific version numbers for these or other ancillary software dependencies used in the implementation.
Experiment Setup | Yes | 5. Experimental Setup; B. Training hyperparameters; Table 2. Training hyperparameters for Allegro Kuka tasks; Table 3. Training hyperparameters for Shadow Hand; Table 4. Training hyperparameters for Shadow Hand
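
The table notes that the paper provides pseudocode (Algorithm 1 SAPG) but does not reproduce it here. For orientation only, the sketch below shows one plausible way a split-and-aggregate update could be structured: the parallel environments are divided into a leader chunk and several follower chunks, and follower data enters the leader's PPO-style update through an importance-weighted surrogate. All names (split_env_chunks, sapg_style_loss, offpolicy_coef) are hypothetical, and the aggregation scheme is an assumption about the general technique rather than a transcription of Algorithm 1; consult the paper for the actual method.

```python
# Minimal sketch, assuming a leader/follower split of parallel environments and
# importance-weighted aggregation of follower data. Hypothetical names; not the
# paper's Algorithm 1.
import torch


def split_env_chunks(num_envs: int, num_chunks: int):
    """Split parallel environment indices into one leader chunk and follower chunks."""
    chunks = torch.chunk(torch.arange(num_envs), num_chunks)
    return chunks[0], chunks[1:]


def sapg_style_loss(logp_new, logp_behavior, advantages, is_leader,
                    clip_eps=0.2, offpolicy_coef=0.5):
    """PPO-style clipped surrogate on the leader chunk, plus an importance-weighted
    surrogate on follower chunks (an assumed form of the aggregation step)."""
    # Importance ratio of the current leader policy to the policy that collected the sample.
    ratio = torch.exp(logp_new - logp_behavior)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    surrogate = -torch.min(ratio * advantages, clipped * advantages)

    on_policy = surrogate[is_leader].mean()
    off_policy = (surrogate[~is_leader].mean()
                  if bool((~is_leader).any()) else torch.tensor(0.0))
    return on_policy + offpolicy_coef * off_policy


if __name__ == "__main__":
    # Toy usage with random data standing in for rollout statistics.
    leader_envs, follower_chunks = split_env_chunks(num_envs=1024, num_chunks=8)
    n = 64
    is_leader = torch.zeros(n, dtype=torch.bool)
    is_leader[: n // 8] = True  # samples drawn from the leader chunk
    loss = sapg_style_loss(torch.randn(n), torch.randn(n), torch.randn(n), is_leader)
    print(loss.item())
```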