Sample Efficient Policy Gradient Methods with Recursive Variance Reduction

Authors: Pan Xu, Felicia Gao, Quanquan Gu

ICLR 2020

Reproducibility assessment (variable, result, and supporting LLM response):
Research Type: Experimental. Evidence from the paper: "We conduct numerical experiments on classic control problems in reinforcement learning to validate the performance of our proposed algorithms." and "Our experimental results on classical control tasks in reinforcement learning demonstrate the superior performance of the proposed SRVR-PG and SRVR-PG-PE algorithms and verify our theoretical analysis."
Researcher Affiliation: Academia. Pan Xu, Felicia Gao, Quanquan Gu; Department of Computer Science, University of California, Los Angeles; Los Angeles, CA 90095, USA; panxu@cs.ucla.edu, fxgao1160@engineering.ucla.edu, qgu@cs.ucla.edu
Pseudocode: Yes. The paper presents Algorithm 1, Stochastic Recursive Variance Reduced Policy Gradient (SRVR-PG), and Algorithm 2, Stochastic Recursive Variance Reduced Policy Gradient with Parameter-based Exploration (SRVR-PG-PE).
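
Since the paper provides pseudocode but no released code, a minimal Python sketch of the Algorithm 1 update scheme may help readers attempting a re-implementation. The helpers sample_trajectories, grad_estimate, and importance_weight are hypothetical placeholders (a trajectory sampler, a REINFORCE-style gradient, and a trajectory likelihood ratio); they are assumptions for illustration, not functions from the paper.

```python
import numpy as np

def srvr_pg(theta0, S, m, N, B, eta,
            sample_trajectories, grad_estimate, importance_weight):
    """Minimal sketch of the SRVR-PG update scheme (Algorithm 1).

    sample_trajectories(theta, n): draw n trajectories from policy pi_theta.
    grad_estimate(theta, tau): REINFORCE-style gradient g(tau | theta).
    importance_weight(theta_ref, theta_cur, tau): likelihood ratio
        p_{theta_ref}(tau) / p_{theta_cur}(tau); needed because trajectories
        are sampled from the current policy while the correction term uses
        the gradient at the previous iterate.
    All three helpers are hypothetical placeholders, not from the paper.
    """
    theta = np.asarray(theta0, dtype=float)
    for _ in range(S):  # outer epochs
        # Reference (snapshot) gradient on a large batch of N trajectories.
        batch = sample_trajectories(theta, N)
        v = np.mean([grad_estimate(theta, tau) for tau in batch], axis=0)
        theta_prev, theta = theta, theta + eta * v  # gradient ascent step
        for _ in range(m - 1):  # inner recursive refinements
            mini = sample_trajectories(theta, B)
            # SARAH/SPIDER-style recursive correction of the estimator.
            delta = np.mean(
                [grad_estimate(theta, tau)
                 - importance_weight(theta_prev, theta, tau)
                   * grad_estimate(theta_prev, tau)
                 for tau in mini], axis=0)
            v = v + delta
            theta_prev, theta = theta, theta + eta * v
    return theta
```

The recursive estimator reuses the previous direction v and corrects it with a small batch of B trajectories, which is the source of the sample-efficiency gain over recomputing a full gradient at every step.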
Open Source Code: No. The paper contains no statement or link indicating that source code for the proposed SRVR-PG or SRVR-PG-PE algorithms is publicly available.
Open Datasets: Yes. Evidence from the paper: "We provide experiment results of the proposed algorithm on benchmark reinforcement learning environments including the Cartpole, Mountain Car and Pendulum problems."
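
The three named tasks are standard control benchmarks. Assuming the OpenAI Gym implementations (the paper does not name the exact simulator or versions), they could be instantiated as follows; the environment IDs are illustrative guesses, not taken from the paper.

```python
import gym

# Illustrative Gym IDs for the three control tasks named in the paper;
# e.g., Pendulum is "Pendulum-v0" in older Gym releases.
for env_id in ["CartPole-v1", "MountainCarContinuous-v0", "Pendulum-v1"]:
    env = gym.make(env_id)
    print(env_id, env.observation_space, env.action_space)
```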
Dataset Splits: No. The paper mentions "benchmark reinforcement learning environments" but gives no train/validation/test splits; these are simulated environments in which policies are learned through interaction rather than from a fixed dataset.
Hardware Specification: No. The paper does not describe the hardware (e.g., GPU models, CPU types, memory) used to run the experiments.
Software Dependencies: No. The paper mentions a Gaussian policy and a grid search for tuning parameters, but it does not list software dependencies with version numbers (e.g., Python, TensorFlow, or PyTorch versions).
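
As an illustration of the Gaussian policy mentioned above, the sketch below computes the score-function term needed by a REINFORCE-style gradient for a linear-mean Gaussian policy with fixed standard deviation. This is a common textbook parameterization, assumed here for illustration; the paper's actual policy network may differ.

```python
import numpy as np

def gaussian_logprob_grad(theta, s, a, sigma=1.0):
    """Gradient of log N(a; theta @ s, sigma^2) with respect to theta.

    Assumes a scalar action with linear mean theta @ s and fixed std sigma;
    an illustrative parameterization, not necessarily the paper's.
    """
    mean = theta @ s
    return ((a - mean) / sigma**2) * s
```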
Experiment Setup: Yes. Evidence from the paper: "The detailed parameters used in the experiments are presented in Appendix E." Table 2 lists the parameters used in the SRVR-PG experiments and Table 3 those used in the SRVR-PG-PE experiments, including specific hyperparameters such as NN size, task horizon, discount factor γ, learning rate η, batch size N, batch size B, and epoch size m.
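
To make those hyperparameter categories concrete, the snippet below shows how such a configuration might be organized in code for the SRVR-PG sketch above. The values are illustrative placeholders, not the actual numbers from Tables 2 and 3 in Appendix E.

```python
# Placeholder configuration mirroring the hyperparameter categories
# reported in Appendix E; the values are illustrative, not the paper's.
srvr_pg_config = {
    "nn_size": (64, 64),   # hidden layer sizes of the policy network
    "task_horizon": 200,   # maximum trajectory length
    "gamma": 0.99,         # discount factor
    "eta": 0.01,           # learning rate (step size)
    "N": 100,              # outer-loop (reference) batch size
    "B": 10,               # inner-loop mini-batch size
    "m": 3,                # epoch size (inner-loop length)
}
```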