SVRG for Policy Evaluation with Fewer Gradient Evaluations
Authors: Zilun Peng, Ahmed Touati, Pascal Vincent, Doina Precup
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate large computational savings provided by the proposed methods. ... Our proposed methods, batching SVRG and SCSG, are evaluated with LSTD, SVRG, SAGA and GTD2 on 4 tasks: Random MDP [Dann et al., 2014], Mountain Car-v0, Cart Pole-v1 and Acrobot-v1 [Brockman et al., 2016]. Detailed experiment setups are in [Peng et al., 2019]. Figure 1 shows policy evaluation results of SVRG and batching SVRG in different environments, and Table 2 shows computational costs of SVRG and batching SVRG. Table 3 shows control performances of all methods. |
| Researcher Affiliation | Collaboration | (1) Mila, McGill University; (2) Mila, Université de Montréal; (3) Facebook AI Research |
| Pseudocode | Yes | Algorithm 1 Batching SVRG and SCSG for policy evaluation. (A hedged code sketch of the SVRG-style update pattern appears after this table.) |
| Open Source Code | Yes | Code of our experiments can be found at: https://github.com/zilunpeng/svrg_for_policy_evaluation_with_fewer_gradients |
| Open Datasets | Yes | Our proposed methods, batching SVRG and SCSG, are evaluated with LSTD, SVRG, SAGA and GTD2 on 4 tasks: Random MDP [Dann et al., 2014], Mountain Car-v0, Cart Pole-v1 and Acrobot-v1 [Brockman et al., 2016]. (A sketch of regenerating transition data from these Gym tasks appears after this table.) |
| Dataset Splits | No | The paper mentions generating data or using datasets of certain sizes (e.g., "1 million data samples", "20000 data samples") but does not explicitly specify training, validation, or test splits. It defers detailed experiment setups to an external reference ([Peng et al., 2019]). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper does not provide version numbers for any software dependencies. A GitHub link is provided, but the paper text itself does not list the dependencies or their versions. |
| Experiment Setup | No | The paper states, "Detailed experiment setups are in [Peng et al., 2019]", deferring these specifics to an external publication. It does not provide explicit hyperparameters or training configurations within the main text. |
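
As noted in the Pseudocode row, the paper's Algorithm 1 describes batching SVRG and SCSG for policy evaluation. The following is a minimal, hedged sketch of the batching-SVRG update pattern only: the full-gradient "anchor" is estimated on a subsample that grows across epochs rather than on the whole dataset. It is illustrated on a generic linear least-squares stand-in, not the paper's MSPBE saddle-point objective, and the function name, step size, and batch schedule are placeholder assumptions rather than the authors' settings.

```python
import numpy as np

def batching_svrg_sketch(A, b, step_size=0.01, epochs=10, batch_size=200,
                         inner_steps=500, rng=None):
    """Sketch of the batching-SVRG pattern on min_theta ||A theta - b||^2 / (2n)."""
    rng = np.random.default_rng() if rng is None else rng
    n, d = A.shape
    theta = np.zeros(d)
    for epoch in range(epochs):
        # "Batching": estimate the anchor (full) gradient on a growing subsample
        # instead of all n samples; this is where the gradient evaluations are saved.
        m = min(n, batch_size * (epoch + 1))
        idx = rng.choice(n, size=m, replace=False)
        anchor = theta.copy()
        g_anchor = A[idx].T @ (A[idx] @ anchor - b[idx]) / m
        for _ in range(inner_steps):
            i = rng.integers(n)
            a_i, r_i = A[i], b[i]
            grad_cur = a_i * (a_i @ theta - r_i)       # stochastic gradient at theta
            grad_anchor = a_i * (a_i @ anchor - r_i)   # same sample at the anchor point
            theta -= step_size * (grad_cur - grad_anchor + g_anchor)
    return theta
```

On synthetic data (e.g. `A = rng.normal(size=(10000, 50))` with `b = A @ true_theta + noise`), this sketch converges toward the least-squares solution while touching only a fraction of the dataset per anchor computation, which is the computational saving the paper's batching scheme targets.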
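
The control tasks in the Open Datasets row come from the open-source Gym suite [Brockman et al., 2016], so raw data can in principle be regenerated even though the paper does not ship it. Below is a minimal sketch of collecting transitions under a uniform-random behaviour policy using the Gymnasium API (the maintained successor to the `gym` package); the function name, sample count, and behaviour policy are illustrative assumptions, and the paper's exact data-generation protocol is described in [Peng et al., 2019].

```python
import numpy as np
import gymnasium as gym  # successor package to the `gym` release cited in the paper

def collect_transitions(env_name="MountainCar-v0", n_samples=20000, seed=0):
    """Collect (s, a, r, s') tuples under a uniform-random behaviour policy."""
    env = gym.make(env_name)
    obs, _ = env.reset(seed=seed)
    transitions = []
    for _ in range(n_samples):
        action = env.action_space.sample()
        next_obs, reward, terminated, truncated, _ = env.step(action)
        transitions.append((np.asarray(obs), action, reward, np.asarray(next_obs)))
        obs = next_obs
        if terminated or truncated:
            obs, _ = env.reset()
    env.close()
    return transitions
```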