On the Bias-Variance-Cost Tradeoff of Stochastic Optimization

Authors: Yifan Hu, Xin Chen, Niao He

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In numerical experiments, we apply four MLMC gradient methods and LSGD on three problems: a synthetic problem with biased oracles, invariant least square, and invariant absolute regression. Figure 1 summarizes the optimal parameter setup that achieves the smallest average error over a certain number of trials under a given total budget for the quadratic program and invariant least square.
Researcher Affiliation | Academia | Yifan Hu (UIUC, yifanhu3@illinois.edu); Xin Chen (UIUC, xinchen@illinois.edu); Niao He (ETH Zürich, niao.he@inf.ethz.ch). Department of Industrial and Enterprise Systems Engineering, University of Illinois at Urbana-Champaign; Optimization and Decision Intelligence (ODI) Group, Department of Computer Science, ETH Zürich.
Pseudocode | Yes |
Algorithm 1: SGD Framework
Input: number of iterations T, stepsizes {γ_t}_{t=1}^{T}, initialization point x_1.
1: for t = 1 to T do
2:   Construct a gradient estimator v(x_t) of ∇F(x_t).
3:   Update x_{t+1} = x_t - γ_t v(x_t).
4: end for
Output: {x_t}_{t=1}^{T}.
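Algorithm 1 can be sketched in a few lines of Python. The quadratic objective F(x) = 0.5·||x||² and the Gaussian-noise gradient oracle below are illustrative assumptions, not the paper's actual estimators or experiments:

```python
import numpy as np

def sgd(grad_estimator, x1, stepsizes):
    """Generic SGD framework: x_{t+1} = x_t - gamma_t * v(x_t).

    grad_estimator(x) may be any (possibly biased) estimator of grad F(x);
    returns the full sequence of iterates {x_t}.
    """
    x = x1
    iterates = [x1]
    for gamma in stepsizes:
        v = grad_estimator(x)          # construct gradient estimator v(x_t)
        x = x - gamma * v              # descent step
        iterates.append(x)
    return iterates

# Illustrative oracle: noisy gradient of F(x) = 0.5 * ||x||^2, i.e. x + noise.
rng = np.random.default_rng(0)
noisy_grad = lambda x: x + 0.1 * rng.standard_normal(x.shape)

iterates = sgd(noisy_grad, x1=np.ones(5), stepsizes=[0.1] * 200)
```

With a constant stepsize the iterates contract toward the minimizer at the origin, up to a noise floor set by the estimator's variance.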
Open Source Code | Yes | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]
Open Datasets | No | The paper describes generating custom datasets for its experiments (e.g., "We generate 2000 sample from ξ...") but does not provide access information (link, citation, or repository) for a publicly available dataset.
Dataset Splits | No | The paper mentions generating samples for experiments but does not provide specific details regarding train, validation, or test splits, or how the data was partitioned for evaluation.
Hardware Specification | No | The paper states in the checklist that resources used were included, but no specific hardware details (e.g., GPU/CPU models, memory) are provided within the main text of the paper.
Software Dependencies | No | The paper does not specify any software names with version numbers used for the experiments.
Experiment Setup | Yes | Figure 1: Top row: synthetic problem. Bottom row: invariant least square. LR: learning rate or stepsizes. Error: average error of the last iterate. Each subfigure shows the best average last-iterate error a method can achieve, with truncation level L selected within {0, ..., 10}, geometric distribution parameter p within {0.1, ..., 0.9}, and over stepsizes.
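The truncation level L and geometric parameter p tuned in Figure 1 are the knobs of a randomized MLMC gradient estimator. A minimal sketch of one such estimator is below; the level-l oracle `toy_grad`, whose bias shrinks like 2^{-l}, is a hypothetical stand-in for the paper's biased oracles, and the construction shown (baseline level plus an importance-weighted correction at a random level) is one standard MLMC variant, not necessarily the exact one used in the paper:

```python
import numpy as np

def mlmc_gradient(grad_level, x, L=5, p=0.5, rng=None):
    """Randomized MLMC gradient estimator with truncated geometric levels.

    Draws a level l in {0, ..., L} with P(l) proportional to (1-p)^l, then
    returns grad_level(x, 0) plus, for l >= 1, the importance-weighted
    correction (grad_level(x, l) - grad_level(x, l-1)) / P(l).  In
    expectation this equals grad_level(x, L), since the telescoping
    corrections cancel the lower-level bias.
    """
    rng = rng or np.random.default_rng()
    probs = (1 - p) ** np.arange(L + 1)
    probs /= probs.sum()                      # truncated geometric P(l)
    l = rng.choice(L + 1, p=probs)
    if l == 0:
        return grad_level(x, 0)
    correction = (grad_level(x, l) - grad_level(x, l - 1)) / probs[l]
    return grad_level(x, 0) + correction

# Hypothetical biased oracle: level-l gradient of F(x) = 0.5 * ||x||^2
# with additive bias 2^{-l} (an assumption for illustration only).
def toy_grad(x, l):
    return x + 2.0 ** (-l)

g = mlmc_gradient(toy_grad, np.zeros(3), L=5, p=0.5,
                  rng=np.random.default_rng(1))
```

Larger L reduces the residual bias while the geometric parameter p trades per-call cost against the variance of the correction term, which is the bias-variance-cost tradeoff the figure sweeps over.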