Accelerating Stochastic Composition Optimization
Authors: Mengdi Wang, Ji Liu, Ethan Fang
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We further demonstrate the application of ASC-PG to reinforcement learning and conduct numerical experiments. We consider three experiments, where in the first two experiments, we compare our proposed accelerated ASC-PG algorithm with the SCGD algorithm [Wang et al., 2016] and the recently proposed GTD2-MP algorithm [Liu et al., 2015]. |
| Researcher Affiliation | Academia | Mengdi Wang (Princeton University), Ji Liu (University of Rochester), and Ethan X. Fang (Pennsylvania State University) |
| Pseudocode | Yes | Algorithm 1 Accelerated Stochastic Compositional Proximal Gradient (ASC-PG) |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing its code or provide a link to a code repository. |
| Open Datasets | Yes | Experiment 1: We use Baird's example [Baird et al., 1995], which is a well-known example to test off-policy convergent algorithms. ... Experiment 2: We generate a Markov decision problem (MDP) using a similar setup as in White and White [2016]. |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It discusses empirical convergence rates and averaging results over multiple runs but does not define validation sets. |
| Hardware Specification | No | The paper does not mention any specific hardware used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software names with version numbers that would be necessary to replicate the experiments. |
| Experiment Setup | Yes | In all cases, we choose the step sizes via comparison studies as in Dann et al. [2014]: ... Experiment 3: ... We add an ℓ1-regularization term, λ‖w‖₁, to the objective function. ... λ = 1e-3, λ = 5e-4 |
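The paper gives ASC-PG only as pseudocode (Algorithm 1) with no released code. A minimal runnable sketch of its two central ingredients, a running estimate of the inner function value and a proximal (soft-thresholding) step for the ℓ1 penalty used in Experiment 3, might look as follows. The step-size schedules, the simplified (non-extrapolated) query point, and the toy least-squares instance below are illustrative assumptions, not the paper's tuned settings.

```python
import numpy as np

def soft_threshold(v, t):
    # Proximal operator of t * ||.||_1: shrink each coordinate toward zero.
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def asc_pg_sketch(g, jac_g, grad_F, x0, lam, alpha=0.02, beta=1.0, iters=2000):
    # Sketch of a compositional proximal-gradient loop in the spirit of
    # ASC-PG: minimize F(g(x)) + lam * ||x||_1.  A running estimate y of the
    # inner value g(x) is maintained with weight b_k, and the l1 term is
    # handled by a soft-thresholding (proximal) step.  The paper's
    # extrapolation step and tuned step-size schedules are simplified here.
    x = x0.copy()
    y = g(x)
    for k in range(1, iters + 1):
        a = alpha / k ** 0.75            # illustrative diminishing step size
        b = min(1.0, beta / k ** 0.5)    # weight for the inner-value estimate
        y = (1.0 - b) * y + b * g(x)     # track the inner function value
        grad = jac_g(x).T @ grad_F(y)    # chain-rule gradient estimate
        x = soft_threshold(x - a * grad, a * lam)
    return x

# Toy instance (an assumption, not from the paper): g(w) = A w and
# F(u) = 0.5 * ||u - b||^2, so the composite problem is an l1-regularized
# least-squares fit with a known sparse ground truth.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 5))
w_true = np.array([1.0, 0.0, -2.0, 0.0, 0.5])
b_vec = A @ w_true
g = lambda w: A @ w
jac_g = lambda w: A                      # Jacobian of the linear inner map
grad_F = lambda u: u - b_vec
w = asc_pg_sketch(g, jac_g, grad_F, np.zeros(5), lam=1e-3)
```

On this deterministic toy problem the composition collapses to ordinary proximal gradient; the compositional machinery only matters when g and F are sampled, which is the stochastic setting the paper analyzes.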