On Variance Reduction in Stochastic Gradient Descent and its Asynchronous Variants
Authors: Sashank J. Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, Alexander J. Smola
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present our empirical results in this section. For our experiments, we study the problem of binary classification via ℓ2-regularized logistic regression. |
| Researcher Affiliation | Academia | Sashank J. Reddi, Carnegie Mellon University, sjakkamr@cs.cmu.edu; Ahmed Hefny, Carnegie Mellon University, ahefny@cs.cmu.edu; Suvrit Sra, Massachusetts Institute of Technology, suvrit@mit.edu; Barnabás Póczos, Carnegie Mellon University, bapoczos@cs.cmu.edu; Alex Smola, Carnegie Mellon University, alex@smola.org |
| Pseudocode | Yes | ALGORITHM 1: Generic Stochastic Variance Reduction Algorithm. Data: x^0 ∈ R^d, α_i^0 = x^0 for all i ∈ [n] ≜ {1, …, n}, step size η > 0. Randomly pick I_T = {i_0, …, i_T} where i_t ∈ {1, …, n} for all t ∈ {0, …, T}; for t = 0 to T do: update the iterate as x^{t+1} ← x^t − η(∇f_{i_t}(x^t) − ∇f_{i_t}(α_{i_t}^t) + (1/n) Σ_i ∇f_i(α_i^t)); A^{t+1} = SCHEDULEUPDATE({x^i}_{i=0}^{t+1}, A^t, t, I_T); end; return x^T |
| Open Source Code | No | The paper states 'All the algorithms were implemented in C++' but does not provide any link or explicit statement that the source code for the described methodology is available. |
| Open Datasets | Yes | We run our experiments on datasets from the LIBSVM website. ... http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/binary.html |
| Dataset Splits | No | The paper mentions using specific datasets but does not provide details on training, validation, or test splits (e.g., percentages, sample counts, or citations to predefined splits). |
| Hardware Specification | Yes | All experiments were conducted on a Google Compute Engine n1-highcpu-32 machine with 32 processors and 28.8 GB RAM. |
| Software Dependencies | No | The paper states that 'All the algorithms were implemented in C++' but does not list specific software dependencies with version numbers (e.g., library names with versions). |
| Experiment Setup | Yes | In all our experiments, we set λ = 1/n. ... The epoch size m is chosen as 2n (as recommended in [10]) in all our experiments. ... A constant step size that gives the best convergence is chosen for the dataset. |
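The pseudocode and experiment setup above can be combined into a short sketch. This is an illustrative SVRG-style instance of the paper's generic variance-reduction scheme applied to ℓ2-regularized logistic regression, using the reported settings λ = 1/n and epoch size m = 2n; the function and variable names are our own, and the step size here is an arbitrary placeholder, not the tuned constant from the paper.

```python
import numpy as np

def svrg_logistic(X, y, eta=0.5, epochs=3):
    """Illustrative SVRG sketch for l2-regularized logistic regression.

    Objective: (1/n) sum_i log(1 + exp(-y_i <x_i, w>)) + (lam/2) ||w||^2,
    with lam = 1/n and epoch size m = 2n as in the paper's setup.
    Names and defaults here are illustrative, not taken from the paper.
    """
    n, d = X.shape
    lam = 1.0 / n          # regularization strength lambda = 1/n
    m = 2 * n              # epoch size m = 2n (as recommended in [10])
    w = np.zeros(d)
    rng = np.random.default_rng(0)

    def grad(w_, i):
        # gradient of the i-th component loss plus the regularizer
        z = y[i] * X[i].dot(w_)
        return -y[i] * X[i] / (1.0 + np.exp(z)) + lam * w_

    for _ in range(epochs):
        w_snap = w.copy()
        # full gradient at the snapshot: (1/n) sum_i grad f_i(alpha_i)
        full = np.mean([grad(w_snap, i) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            # variance-reduced update:
            # x <- x - eta * (grad f_i(x) - grad f_i(alpha_i) + full)
            v = grad(w, i) - grad(w_snap, i) + full
            w -= eta * v
    return w
```

In SVRG all anchors α_i are refreshed to the same snapshot at every epoch; the paper's SCHEDULEUPDATE abstraction also covers per-coordinate schedules such as SAGA, which this sketch does not implement.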