SVRG for Policy Evaluation with Fewer Gradient Evaluations

Authors: Zilun Peng, Ahmed Touati, Pascal Vincent, Doina Precup

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments demonstrate large computational savings provided by the proposed methods. ... Our proposed methods, batching SVRG and SCSG2, are evaluated with LSTD, SVRG, SAGA and GTD2 on 4 tasks: Random MDP [Dann et al., 2014], Mountain Car-v0, Cart Pole-v1 and Acrobot-v1 [Brockman et al., 2016]. Detailed experiment setups are in [Peng et al., 2019]. Figure 1 shows policy evaluation results of SVRG and batching SVRG in different environments, and table 2 shows computational costs of SVRG and batching SVRG. Table 3 shows control performances of all methods."
Researcher Affiliation | Collaboration | "1 Mila, McGill University; 2 Mila, Université de Montréal; 3 Facebook AI Research"
Pseudocode | Yes | "Algorithm 1 Batching SVRG and SCSG for policy evaluation."
Open Source Code | Yes | "Code of our experiments can be found at: https://github.com/zilunpeng/svrg_for_policy_evaluation_with_fewer_gradients"
Open Datasets | Yes | "Our proposed methods, batching SVRG and SCSG2, are evaluated with LSTD, SVRG, SAGA and GTD2 on 4 tasks: Random MDP [Dann et al., 2014], Mountain Car-v0, Cart Pole-v1 and Acrobot-v1 [Brockman et al., 2016]."
Dataset Splits | No | The paper mentions dataset sizes (e.g., "1 million data samples", "20000 data samples") but does not specify training, validation, or test splits; it defers setup details to an external reference ("Detailed experiment setups are in [Peng et al., 2019]").
Hardware Specification | No | The paper gives no details of the hardware used for the experiments, such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper names no software dependencies or version numbers; a GitHub link is provided, but the paper text itself lists none.
Experiment Setup | No | The paper provides no explicit hyperparameters or training configurations in the main text, deferring them to an external publication: "Detailed experiment setups are in [Peng et al., 2019]."
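The paper's Algorithm 1 (batching SVRG and SCSG for policy evaluation) is not reproduced here. As a rough, unofficial illustration of the variance-reduction idea it builds on — not the paper's batching or SCSG variants — a minimal plain-SVRG loop on a linear least-squares stand-in for value-function fitting might look like this (all names and the objective are this sketch's assumptions, not the paper's):

```python
import numpy as np

def svrg(phi, y, lr=0.005, outer=30, seed=0):
    """Plain SVRG on the quadratic objective
    (1/n) * sum_i (phi[i] @ w - y[i])**2,
    a stand-in for fitting a linear value function with features phi."""
    n, d = phi.shape
    w = np.zeros(d)
    rng = np.random.default_rng(seed)

    def grad(w_vec, i):
        # Gradient of the i-th squared-error term.
        return 2.0 * phi[i] * (phi[i] @ w_vec - y[i])

    for _ in range(outer):
        w_snap = w.copy()                             # snapshot iterate
        mu = 2.0 * phi.T @ (phi @ w_snap - y) / n     # full gradient at snapshot
        for _ in range(n):                            # one pass of inner updates
            i = rng.integers(n)
            # Variance-reduced stochastic gradient: single-sample gradient,
            # corrected by the snapshot's gradient at the same sample.
            w = w - lr * (grad(w, i) - grad(w_snap, i) + mu)
    return w
```

The batching variant the paper proposes replaces the full-gradient computation of `mu` with a growing mini-batch estimate, which is where its gradient-evaluation savings come from.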