reproducibilityindex.ai

Stochastic Variance Reduction Methods for Policy Evaluation

Authors: Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical experiments on benchmark problems demonstrate the effectiveness of our methods.
Researcher Affiliation	Collaboration	(1) Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA. (2) Microsoft Research, Redmond, Washington 98052, USA.
Pseudocode	Yes	Algorithm 1 PDBG for Policy Evaluation
Open Source Code	No	The paper does not provide an explicit statement or link to the open-source code for the methodology described.
Open Datasets	Yes	In the first task, we consider a randomly generated MDP with 400 states and 10 actions (Dann et al., 2014). ... Next, we test these algorithms on Mountain Car (Sutton & Barto, 1998, Chapter 8).
Dataset Splits	No	The paper mentions using a 'fixed, finite dataset' but does not provide specific details on training, validation, or test dataset splits or percentages.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies	No	The paper does not specify software dependencies with version numbers.
Experiment Setup	Yes	For step size tuning, σ is chosen from 4e-1, 1e-2, ..., 1e-6 / (1 + lambda_max(C_hat)) and σw is chosen from 4e-1, 1e-1, 1e-2 / lambda_max(C_hat). ... for SVRG we choose N = 2n. ... We chose γ = 0.95.
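
The Experiment Setup row above can be illustrated with a minimal sketch of how the quoted settings might be turned into concrete values. It makes several assumptions that the table does not confirm: C_hat is taken to be the empirical feature covariance (1/n) * sum_t phi_t phi_t^T, lambda_max is estimated with plain power iteration, and the function and key names (`empirical_c_hat`, `lambda_max`, `experiment_setup`) are hypothetical. Only the σw grid is built, because the σ grid is elided in the quoted text.

```python
import random

GAMMA = 0.95                        # discount factor quoted in the table
SIGMA_W_FACTORS = (4e-1, 1e-1, 1e-2)  # sigma_w numerators as quoted in the table


def empirical_c_hat(features):
    """Assumed definition: C_hat = (1/n) * sum_t phi_t phi_t^T."""
    n, d = len(features), len(features[0])
    c_hat = [[0.0] * d for _ in range(d)]
    for phi in features:
        for i in range(d):
            for j in range(d):
                c_hat[i][j] += phi[i] * phi[j] / n
    return c_hat


def lambda_max(matrix, iters=1000, tol=1e-12):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration.

    Starts from the uniform unit vector, so it assumes that vector is not
    orthogonal to the leading eigenvector.
    """
    d = len(matrix)
    v = [1.0 / d ** 0.5] * d
    eig = 0.0
    for _ in range(iters):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
        # Rayleigh quotient v^T M v for the normalized iterate.
        new_eig = sum(
            v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
        )
        if abs(new_eig - eig) < tol:
            return new_eig
        eig = new_eig
    return eig


def experiment_setup(features):
    """Collect the quoted settings for a dataset of n feature vectors."""
    n = len(features)
    lam = lambda_max(empirical_c_hat(features))
    return {
        "gamma": GAMMA,
        "lambda_max": lam,
        # sigma_w candidates: quoted factors divided by lambda_max(C_hat).
        "sigma_w_grid": [c / lam for c in SIGMA_W_FACTORS],
        # SVRG setting quoted in the table: N = 2n.
        "svrg_inner_steps": 2 * n,
    }


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical data: 50 random 4-dimensional feature vectors.
    phis = [[random.gauss(0.0, 1.0) for _ in range(4)] for _ in range(50)]
    print(experiment_setup(phis))
```

In practice a library eigenvalue routine such as `numpy.linalg.eigvalsh` would likely replace the power iteration; the pure-Python version is used here only to keep the sketch dependency-free.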