Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Accelerating Stochastic Composition Optimization

Authors: Mengdi Wang, Ji Liu, Ethan Fang

NeurIPS 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We further demonstrate the application of ASC-PG to reinforcement learning and conduct numerical experiments. We consider three experiments, where in the first two experiments, we compare our proposed accelerated ASC-PG algorithm with the SCGD algorithm [Wang et al., 2016] and the recently proposed GTD2-MP algorithm [Liu et al., 2015].
Researcher Affiliation | Academia | Mengdi Wang, Ji Liu, and Ethan X. Fang; Princeton University, University of Rochester, Pennsylvania State University
Pseudocode | Yes | Algorithm 1 Accelerated Stochastic Compositional Proximal Gradient (ASC-PG)
Open Source Code | No | The paper does not provide any explicit statement about open-sourcing its code, nor a link to a code repository.
Open Datasets | Yes | Experiment 1: We use Baird's example [Baird et al., 1995], which is a well-known example to test the off-policy convergent algorithms. ... Experiment 2: We generate a Markov decision problem (MDP) using similar setup as in White and White [2016].
Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. It discusses empirical convergence rates and averages results over multiple runs, but does not define validation sets.
Hardware Specification | No | The paper does not mention the specific hardware used to run the experiments.
Software Dependencies | No | The paper does not specify any software names with version numbers that would be necessary to replicate the experiments.
Experiment Setup | Yes | In all cases, we choose the step sizes via comparison studies as in Dann et al. [2014]: ... Experiment 3: ... We add an ℓ1-regularization term, λ‖w‖₁, to the objective function. ... λ = 1e-3, λ = 5e-4
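The ℓ1 setup quoted above relies on the proximal step in ASC-PG: for a term λ‖w‖₁, the proximal operator is elementwise soft-thresholding, a standard identity. A minimal NumPy sketch of that operator (the vector `w` and the unit step size are illustrative choices, not values from the paper):

```python
import numpy as np

def soft_threshold(w, tau):
    """Proximal operator of tau * ||w||_1: shrink each entry toward
    zero by tau, zeroing entries whose magnitude is below tau."""
    return np.sign(w) * np.maximum(np.abs(w) - tau, 0.0)

# Illustrative iterate and step size (not from the paper).
w = np.array([0.5, -0.0003, 0.002, -1.2])
step = 1.0
for lam in (1e-3, 5e-4):  # the two lambda values quoted above
    print(lam, soft_threshold(w, step * lam))
```

With λ = 1e-3, the second entry (|−0.0003| < 0.001) is zeroed out, which is how the ℓ1 term induces sparsity in the learned weights.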