Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
PPG Reloaded: An Empirical Study on What Matters in Phasic Policy Gradient
Authors: Kaixin Wang, Daquan Zhou, Jiashi Feng, Shie Mannor
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | However, through an extensive empirical study, we unveil that policy regularization and data diversity are what actually matters. (...) To get a clear understanding of what matters in PPG, we conduct a large-scale empirical study on Procgen. Specifically, we focus on the three aspects in Table 1 and run ablation experiments with proper control of other (possibly confounding) hyperparameters. Our study consists of comprehensive experiments covering all 16 Procgen games in both sample efficiency and generalization setups. |
| Researcher Affiliation | Collaboration | Kaixin Wang 1 Daquan Zhou 2 Jiashi Feng 2 Shie Mannor 1 3 (...) 1 Faculty of Electrical And Computer Engineering, Technion, Haifa, Israel 2 ByteDance, Singapore 3 NVIDIA Research, Haifa, Israel. |
| Pseudocode | Yes | Algorithm 1 Single-network PPG framework |
| Open Source Code | No | The paper does not contain any statement about making the source code for their methodology publicly available or provide a link to a code repository. |
| Open Datasets | Yes | The large-scale Procgen benchmark (Cobbe et al., 2020) is used as the testbed of our study. |
| Dataset Splits | No | The paper describes training and testing setups for Procgen but does not explicitly provide training/validation/test dataset splits or percentages for reproduction. |
| Hardware Specification | Yes | All experiments are conducted using Intel Xeon Platinum 8260 CPU and NVIDIA V100 GPU. |
| Software Dependencies | No | The paper mentions using 'PyTorch (Paszke et al., 2019) as a deep learning framework' but does not specify a version number for PyTorch or other software dependencies. |
| Experiment Setup | Yes | A. Training Details: For all experiments, we use the residual convolutional networks in IMPALA (Espeholt et al., 2018) and the Adam optimizer (Kingma & Ba, 2015), following previous practices (Cobbe et al., 2020; 2021). We use PyTorch (Paszke et al., 2019) as a deep learning framework. All experiments are conducted using Intel Xeon Platinum 8260 CPU and NVIDIA V100 GPU. The table below lists the default values of the parameters in our experiments. |