Accelerating Stochastic Composition Optimization

Authors: Mengdi Wang, Ji Liu, Ethan Fang

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We further demonstrate the application of ASC-PG to reinforcement learning and conduct numerical experiments. We consider three experiments, where in the first two experiments, we compare our proposed accelerated ASC-PG algorithm with the SCGD algorithm [Wang et al., 2016] and the recently proposed GTD2-MP algorithm [Liu et al., 2015].
Researcher Affiliation | Academia | Mengdi Wang (Princeton University), Ji Liu (University of Rochester), and Ethan X. Fang (Pennsylvania State University)
Pseudocode | Yes | Algorithm 1: Accelerated Stochastic Compositional Proximal Gradient (ASC-PG). (An illustrative sketch of this update pattern is given after the table.)
Open Source Code | No | The paper does not provide any explicit statement about open-sourcing its code, nor a link to a code repository.
Open Datasets | Yes | Experiment 1: We use Baird's example [Baird et al., 1995], which is a well-known example to test the off-policy convergent algorithms. ... Experiment 2: We generate a Markov decision problem (MDP) using similar setup as in White and White [2016]. (A toy sketch of the compositional structure of these policy-evaluation objectives also follows the table.)
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It discusses empirical convergence rates and averages results over multiple runs, but does not define validation sets.
Hardware Specification | No | The paper does not mention the specific hardware used to run the experiments.
Software Dependencies | No | The paper does not specify any software names with version numbers that would be needed to replicate the experiments.
Experiment Setup | Yes | In all cases, we choose the step sizes via comparison studies as in Dann et al. [2014]: ... Experiment 3: ... We add an ℓ1-regularization term, λ‖w‖₁, to the objective function. ... λ = 1e-3, λ = 5e-4
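
For readers who want a concrete picture of what Algorithm 1 (ASC-PG) does, the snippet below is a minimal sketch of the general update pattern: a stochastic compositional gradient step, a proximal (soft-thresholding) step handling an ℓ1 penalty like the λ‖w‖₁ term used in Experiment 3, and an extrapolation-smoothing update of an auxiliary variable that tracks the inner expectation. The toy objective, dimensions, noise model, step-size schedules, and helper names are assumptions made for illustration only; consult the paper for the exact Algorithm 1 and its step-size conditions.

```python
# Illustrative sketch (not a verbatim transcription of the paper's Algorithm 1):
# a compositional proximal-gradient loop for  min_x  f(E[g_w(x)]) + lam * ||x||_1
# on a toy quadratic composition. All problem data below are assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 5
A = rng.normal(size=(m, n)) / np.sqrt(m)    # inner map: g_w(x) ~ A x + noise
b = rng.normal(size=m)                      # outer map: f_v(y) = 0.5 * ||y - b||^2
lam = 1e-3                                  # l1 weight, R(x) = lam * ||x||_1

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (soft thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def sample_g(x):                            # noisy inner function value
    return A @ x + 0.01 * rng.normal(size=m)

def sample_jac_g(x):                        # noisy inner Jacobian
    return A + 0.01 * rng.normal(size=(m, n))

def sample_grad_f(y):                       # noisy outer gradient
    return (y - b) + 0.01 * rng.normal(size=m)

x = np.zeros(n)
y = sample_g(x)                             # auxiliary estimate of g(x)
for k in range(1, 5001):
    alpha = 0.1 / k ** 0.75                 # assumed step-size schedules
    beta = 1.0 / k ** 0.5
    # main iterate: stochastic compositional gradient + l1 proximal step
    grad = sample_jac_g(x).T @ sample_grad_f(y)
    x_next = soft_threshold(x - alpha * grad, alpha * lam)
    # extrapolation-smoothing update of the auxiliary iterate tracking g(x)
    z = (1.0 - 1.0 / beta) * x + (1.0 / beta) * x_next
    y = (1.0 - beta) * y + beta * sample_g(z)
    x = x_next

print("final objective:", 0.5 * np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x)))
```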
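
The reinforcement-learning experiments (Baird's example and the randomly generated MDP) minimize a Bellman-residual objective that is a composition of two expectations, which is what makes compositional methods relevant. The snippet below is our own toy construction, not the paper's Baird or White-and-White setup: it builds a small random MDP with linear features and shows that the naive plug-in estimate of the squared expected residual is biased upward, which is the usual motivation for two-time-scale methods such as SCGD and ASC-PG.

```python
# Toy illustration (assumed construction, not the paper's experimental setup):
# policy evaluation with linear features phi(s)^T w written as f(E[g(w)]),
# where the inner expectation is over sampled transitions (s, r, s').
import numpy as np

rng = np.random.default_rng(1)
nS, d, gamma = 6, 3, 0.9
phi = rng.normal(size=(nS, d))              # linear state features
P = rng.dirichlet(np.ones(nS), size=nS)     # transition matrix under the policy
R = rng.normal(size=(nS, nS))               # rewards r(s, s')

def g_expected(w):
    """Inner map: expected Bellman residual for every state."""
    v = phi @ w
    return v - (P * (R + gamma * v[None, :])).sum(axis=1)

def g_sample(w):
    """Unbiased sample of the inner map: one sampled transition per state."""
    v = phi @ w
    s_next = np.array([rng.choice(nS, p=P[s]) for s in range(nS)])
    return v - (R[np.arange(nS), s_next] + gamma * v[s_next])

def f(y):
    """Outer map: sum of squared expected residuals."""
    return float(y @ y)

w = rng.normal(size=d)
print("true objective f(E[g(w)]):", f(g_expected(w)))
print("average plug-in estimate f(g_sample(w)) (biased upward):",
      np.mean([f(g_sample(w)) for _ in range(2000)]))
```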