PPG Reloaded: An Empirical Study on What Matters in Phasic Policy Gradient

Authors: Kaixin Wang, Daquan Zhou, Jiashi Feng, Shie Mannor

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | However, through an extensive empirical study, we unveil that policy regularization and data diversity are what actually matters. (...) To get a clear understanding of what matters in PPG, we conduct a large-scale empirical study on Procgen. Specifically, we focus on the three aspects in Table 1 and run ablation experiments with proper control of other (possibly confounding) hyperparameters. Our study consists of comprehensive experiments covering all 16 Procgen games in both sample efficiency and generalization setups.
Researcher Affiliation | Collaboration | Kaixin Wang 1, Daquan Zhou 2, Jiashi Feng 2, Shie Mannor 1 3 (...) 1 Faculty of Electrical and Computer Engineering, Technion, Haifa, Israel; 2 ByteDance, Singapore; 3 NVIDIA Research, Haifa, Israel.
Pseudocode | Yes | Algorithm 1 Single-network PPG framework
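The single-network PPG framework named in the row above alternates a policy phase (PPO-style updates on a shared network) with an auxiliary phase (value distillation under a KL constraint that keeps the policy close to its pre-auxiliary behavior). The sketch below illustrates only that phase schedule; the function name and hyperparameter names (`n_pi`, `e_aux`) follow common PPG conventions and are assumptions, not details taken from this report.

```python
# Hedged sketch of the phasic schedule in a single-network PPG setup.
# Names and defaults (n_pi, e_aux) are assumptions following PPG
# conventions, not values confirmed by the paper.

def phasic_schedule(n_phases, n_pi=32, e_aux=6):
    """Enumerate updates: n_pi policy iterations, then e_aux auxiliary epochs.

    Policy iterations run a PPO-style clipped-surrogate update on fresh
    rollouts; auxiliary epochs distill value targets into the shared
    network while a KL penalty regularizes the policy.
    """
    events = []
    for phase in range(n_phases):
        for i in range(n_pi):
            events.append(("policy", phase, i))  # clipped-surrogate update
        for e in range(e_aux):
            events.append(("aux", phase, e))     # value distillation + KL penalty
    return events

schedule = phasic_schedule(n_phases=2, n_pi=4, e_aux=2)
```

Flattening the schedule this way makes the ratio of policy to auxiliary work (`n_pi` vs. `e_aux`) explicit, which is one of the knobs an ablation study over PPG would vary.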
Open Source Code | No | The paper does not contain any statement about making the source code for their methodology publicly available or provide a link to a code repository.
Open Datasets | Yes | The large-scale Procgen benchmark (Cobbe et al., 2020) is used as the testbed of our study.
Dataset Splits | No | The paper describes training and testing setups for Procgen but does not explicitly provide training/validation/test dataset splits or percentages for reproduction.
Hardware Specification | Yes | All experiments are conducted using Intel Xeon Platinum 8260 CPU and NVIDIA V100 GPU.
Software Dependencies | No | The paper mentions using 'PyTorch (Paszke et al., 2019) as a deep learning framework' but does not specify a version number for PyTorch or other software dependencies.
Experiment Setup | Yes | A. Training Details: For all experiments, we use the residual convolutional networks in IMPALA (Espeholt et al., 2018) and the Adam optimizer (Kingma & Ba, 2015), following previous practices (Cobbe et al., 2020; 2021). We use PyTorch (Paszke et al., 2019) as a deep learning framework. All experiments are conducted using Intel Xeon Platinum 8260 CPU and NVIDIA V100 GPU. The table below lists the default values of the parameters in our experiments.
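The quoted setup refers to a parameter table that is not reproduced in this report. As a sketch of how such defaults might be encoded for reproduction, the dictionary below uses the widely cited PPG/Procgen defaults from Cobbe et al. (2021) as placeholder values; they are an assumption, not numbers confirmed by this report's source paper.

```python
# Hypothetical defaults table for a Procgen/PPG experiment. Values are the
# commonly used Cobbe et al. (2021) defaults, inserted here only as
# placeholders -- the paper's actual table is not shown in this report.
DEFAULTS = {
    "optimizer": "adam",      # Adam (Kingma & Ba, 2015), per the quoted setup
    "learning_rate": 5e-4,
    "gamma": 0.999,           # discount factor
    "gae_lambda": 0.95,       # GAE parameter
    "num_envs": 64,           # parallel Procgen environments
    "rollout_length": 256,    # steps per environment per rollout
}

def get_param(name, overrides=None):
    """Look up a parameter, letting an individual ablation override it."""
    overrides = overrides or {}
    return overrides.get(name, DEFAULTS[name])
```

Keeping ablation-specific changes in an `overrides` dict, with everything else pinned to defaults, mirrors the paper's stated practice of controlling possibly confounding hyperparameters.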