reproducibilityindex.ai

Statistical inference with implicit SGD: proximal Robbins-Monro vs. Polyak-Ruppert

Authors: Yoonhyung Lee, Sungdong Lee, Joong-Ho Won

ICML 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	6. Experiments Following Bach & Moulines (2011), we examined the convergence behavior of prox RM and prox PR using two univariate functions: L(θ) = (1/2)θ^2 (strongly convex) and L(θ) = (1/4)θ^4 (non-strongly convex)... Figs. 1 and 2 plot the squared estimation... Table 2 collects the results. Table 3 summarizes the results.
Researcher Affiliation	Collaboration	(1) Kakao Entertainment Corp. (2) Department of Statistics, Seoul National University.
Pseudocode	No	The paper describes algorithms using mathematical equations, e.g., 'θn = θn-1 - γn ∇ℓ(Zn, θn)', but does not include structured 'Pseudocode' or 'Algorithm' blocks.
Open Source Code	No	No explicit statement about making code open source or providing a link to a code repository was found.
Open Datasets	No	We generated Zn = (yn, xn) where yn = xn^T θ + ϵn, xn ~ N(0, Σ), and ϵn ~ N(0, 1)... we instead used a smoothed version... and let Z ~ N(0, 1).
Dataset Splits	No	The paper describes the generation of synthetic data and the number of iterations (e.g., '100 independent runs of n = 10^6 ISGD iterations'), but does not specify explicit training/validation/test dataset splits.
Hardware Specification	No	No specific hardware details (like GPU or CPU models, memory, or cloud instance types) used for experiments are mentioned in the paper.
Software Dependencies	No	The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that were used for the experiments.
Experiment Setup	Yes	We fixed the initial point θ0 = 10 for the quadratic and θ0 = 2 for the quartic function, and observed 100 independent runs of n = 10^6 ISGD iterations for initial step size γ1 ∈ {1/5, 1, 5, 20, 100} and exponent γ ∈ {1/5, 1/3, 2/5, 1/2, 2/3, 1}. We fixed θ = (1, ..., 1)^T and ran n = 10^5 iterations of ISGD for γ ∈ {0.6, 1.0}, p ∈ {5, 20, 100, 200} with θ0 = 0 for each type of Σ. The n = 10^6 iterations were started with θ0 = 0 for each replication, where γ ∈ {0.6, 1} and µ ∈ {10^-1, 10^-2, 10^-3}; we used γ1 = 250 when γ = 1 and γ1 = 30 when γ = 0.6.
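
Since the paper provides neither pseudocode nor code, the following is a minimal sketch of what the quoted linear-regression setup could look like; it is not the authors' implementation. It assumes squared-error loss, Σ = I, a step size of the form γn = γ1 · n^(-γ) with γ1 = 1, and a plain running average of the iterates as the Polyak-Ruppert estimate. The function name `implicit_sgd_linear` and all defaults are hypothetical. For squared-error loss the implicit update θn = θn-1 + γn (yn - xn^T θn) xn can be solved for θn in closed form, which the loop uses.

```python
import numpy as np


def implicit_sgd_linear(n_iter, p, gamma1=1.0, gamma=0.6, seed=0):
    """Implicit SGD for least squares on synthetic data (illustrative sketch).

    Assumptions not taken from the source: Sigma = I, gamma1 = 1,
    and a plain running average as the Polyak-Ruppert estimate.
    """
    rng = np.random.default_rng(seed)
    theta_true = np.ones(p)   # theta = (1, ..., 1)^T, as in the quoted setup
    theta = np.zeros(p)       # theta0 = 0, as in the quoted setup
    theta_bar = np.zeros(p)   # running average of the iterates

    for n in range(1, n_iter + 1):
        x = rng.standard_normal(p)                  # x_n ~ N(0, I) (assumed Sigma)
        y = x @ theta_true + rng.standard_normal()  # y_n = x_n^T theta + eps_n
        step = gamma1 * n ** (-gamma)               # assumed schedule gamma_n

        # Closed-form solution of the implicit equation
        #   theta_n = theta_{n-1} + step * (y - x^T theta_n) * x
        resid = y - x @ theta
        theta = theta + step / (1.0 + step * (x @ x)) * resid * x

        # Incremental mean of theta_1, ..., theta_n
        theta_bar += (theta - theta_bar) / n

    return theta, theta_bar


if __name__ == "__main__":
    last, avg = implicit_sgd_linear(n_iter=20_000, p=5)
    print("last-iterate error:", np.linalg.norm(last - 1.0))
    print("averaged error:    ", np.linalg.norm(avg - 1.0))
```

The shrinkage factor 1 / (1 + step · ||x||^2) is what keeps the implicit update stable for large initial step sizes, which is why a setup like the quoted one can use values such as γ1 = 250 without diverging.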