Beyond Worst-Case Analysis in Stochastic Approximation: Moment Estimation Improves Instance Complexity

Authors: Jingzhao Zhang, Hongzhou Lin, Subhro Das, Suvrit Sra, Ali Jadbabaie

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We test the noise variation and empirical performance of our moment estimation algorithm on policy optimization and neural network training. In this section, we present three sets of experiments: synthetic least squares, policy optimization for mujoco tasks and neural network training on Cifar10 dataset."
Researcher Affiliation | Collaboration | Jingzhao Zhang et al.: 1) IIIS, Tsinghua University; 2) Amazon; 3) MIT-IBM Watson AI Lab, IBM Research; 4) Massachusetts Institute of Technology.
Pseudocode | Yes | Algorithm 1: Moment Estimation SGD(x1, T, c, m)
Open Source Code | No | The paper references a third-party repository (https://github.com/google-research/realworldrl_suite) for a baseline implementation, but does not state that code for its own proposed method (Algorithm 1) is publicly available.
Open Datasets | Yes | "We test the noise variation and empirical performance of our moment estimation algorithm on policy optimization and neural network training. In this section, we present three sets of experiments: synthetic least squares, policy optimization for mujoco tasks and neural network training on Cifar10 dataset."
Dataset Splits | No | The paper uses the CIFAR-10 dataset and MuJoCo tasks, but does not specify training/validation/test splits (as percentages or absolute counts) or cite standard splits for reproducibility.
Hardware Specification | No | The paper does not report the hardware used for its experiments, such as GPU models, CPU types, or memory.
Software Dependencies | No | The paper mentions ADAM (Kingma & Ba, 2014) and RMSProp (Tieleman & Hinton, 2012) as methods similar to its own, and uses the sklearn library for the synthetic experiments, but gives no version numbers for any software dependency.
Experiment Setup | No | The paper mentions "fine-tuning the step sizes for each algorithm by grid-searching" in the synthetic experiments, and grid-searching among 10^k (k an integer) for policy optimization, but provides no explicit hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or other system-level training settings needed for full reproducibility.
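Since the paper provides pseudocode but no released code, a minimal sketch of what a moment-estimation SGD loop of this shape could look like is given below. It assumes the signature (x1, T, c, m) denotes the initial iterate, iteration budget, step-size constant, and moment-averaging factor, and that a running second-moment estimate scales the step RMSProp-style (the paper itself notes the similarity to ADAM and RMSProp). All names and the update rule here are illustrative reconstructions, not the authors' actual Algorithm 1.

```python
import numpy as np

def moment_estimation_sgd(grad_oracle, x1, T, c=0.1, m=0.9, eps=1e-8):
    """Hypothetical moment-estimation SGD sketch.

    grad_oracle(x): returns a stochastic gradient at x.
    x1: initial iterate; T: number of iterations.
    c:  base step-size constant.
    m:  exponential-averaging factor for the moment estimate.
    """
    x = np.asarray(x1, dtype=float)
    v = np.zeros_like(x)  # running estimate of the gradient second moment E[g^2]
    for _ in range(T):
        g = grad_oracle(x)
        v = m * v + (1.0 - m) * g ** 2        # online second-moment estimate
        x = x - c * g / (np.sqrt(v) + eps)    # moment-scaled gradient step
    return x

# Illustrative use on f(x) = 0.5 * ||x||^2, whose gradient is x:
x_final = moment_estimation_sgd(lambda x: x, np.array([5.0]), T=200)
```

Scaling the step by the inverse root of the second-moment estimate makes the effective step size roughly invariant to the gradient's magnitude, which is the RMSProp-style behavior the paper's comparison suggests; the actual algorithm and its guarantees should be taken from the paper's Algorithm 1.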