Katyusha X: Simple Momentum Method for Stochastic Sum-of-Nonconvex Optimization
Authors: Zeyuan Allen-Zhu
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Figure 1: A simple illustration on minimizing f(x) = ½ x^⊤(µI − BB^⊤)x, where B ∈ ℝ^{1000×1000} is a random ±1 matrix and µ = λ₁(BB^⊤) + 0.5(λ₁(BB^⊤) − λ₂(BB^⊤)). Such f(x) is a typical instance in stochastic PCA (Garber et al., 2016). Remark 1. In SVRG, the best learning rate is η = 0.4/L after tuning. Remark 2. We used η = 0.4/L for Katyusha Xw. We used η = 0.4/L and τ = 0.1 for Katyusha Xs. Remark 3. In the mini-batch experiment, we used η = 0.4b/L and τ = 0.1. The parallel speed-up is in terms of achieving objective error 10⁻³, 10⁻⁵, 10⁻⁷, 10⁻⁹. |
| Researcher Affiliation | Collaboration | Microsoft Research AI. Correspondence to: Zeyuan Allen Zhu <zeyuan@csail.mit.edu>. |
| Pseudocode | No | The paper describes algorithms with updates in text but does not include a formally labeled 'Algorithm' or 'Pseudocode' block. |
| Open Source Code | No | Full and future versions can be found on https://arxiv.org/abs/1802.03866. This link is to the arXiv paper itself, not source code. |
| Open Datasets | No | A simple illustration on minimizing f(x) = ½ x^⊤(µI − BB^⊤)x, where B ∈ ℝ^{1000×1000} is a random ±1 matrix and µ = λ₁(BB^⊤) + 0.5(λ₁(BB^⊤) − λ₂(BB^⊤)). Such f(x) is a typical instance in stochastic PCA (Garber et al., 2016). This describes a synthetic dataset without providing access information. |
| Dataset Splits | No | The paper uses a synthetic dataset for illustration but does not specify any train/validation/test splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | Remark 1. In SVRG, the best learning rate is η = 0.4/L after tuning. Remark 2. We used η = 0.4/L for Katyusha Xw. We used η = 0.4/L and τ = 0.1 for Katyusha Xs. Remark 3. In the mini-batch experiment, we used η = 0.4b/L and τ = 0.1. |
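The synthetic instance quoted above is fully specified by its construction, so it can be regenerated without any dataset access. The sketch below is a hedged, reduced-dimension illustration (the paper uses a 1000×1000 matrix; the random seed, dimension `d = 100`, and helper `f` here are our assumptions, not the authors' code):

```python
import numpy as np

# Sketch of the synthetic stochastic-PCA instance described in the report:
# f(x) = 1/2 x^T (mu*I - B B^T) x, with B a random +/-1 matrix and
# mu = lam1 + 0.5*(lam1 - lam2). Seed and dimension are illustrative choices.
rng = np.random.default_rng(0)
d = 100  # the paper uses d = 1000

B = rng.choice([-1.0, 1.0], size=(d, d))
M = B @ B.T  # symmetric positive semidefinite

eigvals = np.linalg.eigvalsh(M)      # ascending order
lam1, lam2 = eigvals[-1], eigvals[-2]
mu = lam1 + 0.5 * (lam1 - lam2)      # shift strictly above the top eigenvalue

def f(x):
    """Objective f(x) = 1/2 x^T (mu*I - M) x."""
    return 0.5 * x @ (mu * x - M @ x)

# Since mu > lam1, the matrix mu*I - M is positive definite, so f is
# strongly convex with minimum 0 at x = 0, even though each stochastic
# component mu*I - b_i b_i^T may be nonconvex ("sum-of-nonconvex").
x0 = rng.standard_normal(d)
```

Note that `mu > lam1` holds whenever the spectral gap `lam1 - lam2` is positive, which is why the overall objective is strongly convex while individual components need not be.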