Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Katyusha X: Simple Momentum Method for Stochastic Sum-of-Nonconvex Optimization
Authors: Zeyuan Allen-Zhu
ICML 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: A simple illustration on minimizing $f(x) = \frac{1}{2} x^\top (\mu I - BB^\top) x$, where $B \in \mathbb{R}^{1000 \times 1000}$ is a random $\pm 1$ matrix, and $\mu = \lambda_1(BB^\top) + 0.5\,(\lambda_1(BB^\top) - \lambda_2(BB^\top))$. Such $f(x)$ is a typical instance in stochastic PCA (Garber et al., 2016). Remark 1. In SVRG, the best learning rate is $\eta = 0.4/L$ after tuning. Remark 2. We used $\eta = 0.4/L$ for Katyusha X$^w$. We used $\eta = 0.4/L$ and $\tau = 0.1$ for Katyusha X$^s$. Remark 3. In the mini-batch experiment, we used $\eta = 0.4b/L$ and $\tau = 0.1$. The parallel speed-up is in terms of achieving objective error $10^{-3}, 10^{-5}, 10^{-7}, 10^{-9}$. |
| Researcher Affiliation | Collaboration | Microsoft Research AI. Correspondence to: Zeyuan Allen-Zhu <EMAIL>. |
| Pseudocode | No | The paper describes algorithms with updates in text but does not include a formally labeled 'Algorithm' or 'Pseudocode' block. |
| Open Source Code | No | Full and future versions can be found on https://arxiv.org/abs/1802.03866. This link is to the arXiv paper itself, not source code. |
| Open Datasets | No | A simple illustration on minimizing $f(x) = \frac{1}{2} x^\top (\mu I - BB^\top) x$, where $B \in \mathbb{R}^{1000 \times 1000}$ is a random $\pm 1$ matrix, and $\mu = \lambda_1(BB^\top) + 0.5\,(\lambda_1(BB^\top) - \lambda_2(BB^\top))$. Such $f(x)$ is a typical instance in stochastic PCA (Garber et al., 2016). This describes a synthetic dataset without providing access information. |
| Dataset Splits | No | The paper uses a synthetic dataset for illustration but does not specify any train/validation/test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Remark 1. In SVRG, the best learning rate is $\eta = 0.4/L$ after tuning. Remark 2. We used $\eta = 0.4/L$ for Katyusha X$^w$. We used $\eta = 0.4/L$ and $\tau = 0.1$ for Katyusha X$^s$. Remark 3. In the mini-batch experiment, we used $\eta = 0.4b/L$ and $\tau = 0.1$. |
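
Because no dataset or code is released, the synthetic instance quoted in the table has to be regenerated by anyone reproducing the figure. The following is a minimal sketch of one way to do that with NumPy, not the author's code. It assumes the quadratic has the form (1/2) xᵀ(µI − BBᵀ)x, with µ set from the top two eigenvalues of BBᵀ as quoted; any further terms in the paper's objective are not visible in the quoted text and are omitted. The names `make_instance` and `f`, the seed, and the uniform sampling of signs are hypothetical choices made for illustration.

```python
import numpy as np


def make_instance(n=1000, seed=0):
    """Build a synthetic quadratic instance (assumed form, see lead-in)."""
    rng = np.random.default_rng(seed)
    # Assumption: entries of B are independent and uniform over {-1, +1}.
    B = rng.choice([-1.0, 1.0], size=(n, n))
    A = B @ B.T
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    eigs = np.linalg.eigvalsh(A)
    lam1, lam2 = eigs[-1], eigs[-2]
    # Shift chosen as quoted: mu = lam1 + 0.5 * (lam1 - lam2).
    mu = lam1 + 0.5 * (lam1 - lam2)
    # Hessian of the assumed objective; its smallest eigenvalue is mu - lam1.
    H = mu * np.eye(n) - A
    return B, mu, H


def f(x, H):
    """Evaluate the assumed objective 0.5 * x^T H x."""
    return 0.5 * x @ H @ x
```

With this choice of µ, the smallest eigenvalue of the Hessian equals half the gap between the two largest eigenvalues of BBᵀ, so the overall objective is strongly convex whenever that gap is nonzero. Without a published seed, a regenerated B will differ from the one used in the paper, so exact curves should not be expected to match.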