Stochastic Convex Optimization: Faster Local Growth Implies Faster Global Convergence
Authors: Yi Xu, Qihang Lin, Tianbao Yang
ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we perform some experiments to demonstrate the effectiveness of the proposed algorithms. |
| Researcher Affiliation | Academia | (1) Department of Computer Science, The University of Iowa, Iowa City, IA 52242, USA; (2) Department of Management Sciences, The University of Iowa, Iowa City, IA 52242, USA. |
| Pseudocode | Yes | Algorithm 1, ASSG-c(w0, K, t, D1, ϵ0), and Algorithm 2, the ASSG-r algorithm for solving (1), are provided (a hedged sketch of the stagewise structure follows the table). |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | We use very large-scale datasets from the LIBSVM website in experiments, including covtype.binary (n = 581012), real-sim (n = 72309), url (n = 2396130) for classification, and million songs (n = 463715), E2006-tfidf (n = 16087), E2006-log1p (n = 16087) for regression. |
| Dataset Splits | No | The paper mentions using datasets for experiments but does not provide specific training, validation, or test split percentages or sample counts, nor does it refer to standard predefined splits with citations. |
| Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using SAGA and SVRG++ (which are algorithms), but does not provide specific version numbers for any software dependencies, libraries, or programming languages used. |
| Experiment Setup | Yes | The regularization parameter λ is set to 10^-4 in all tasks (we also perform the experiments with λ = 10^-2 and include the results in the supplement). We set γ = 1 in Huber loss and p = 1.5 in robust regression ... We use a decreasing step size proportional to 1/τ (τ is the iteration index) in SSG ... The value of D1 in both ASSG and RASSG is set to 100 for all problems ... In implementing RASSG, we restart every 5 stages with t increased by a factor of 1.15, 2 and 2 respectively for hinge loss, Huber loss and robust regression. We tune the parameter ω among {0.3, 0.6, 0.9, 1}. (The restart schedule is sketched after the table.) |
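
For context on the Pseudocode row, here is a minimal Python sketch of the stagewise structure that Algorithm 1 (ASSG-c) describes: repeated stages of projected stochastic subgradient descent inside a shrinking ball, with the target error, ball radius, and step size halved from stage to stage. The subgradient oracle `subgrad`, the gradient-norm bound `G`, and the exact step-size constant are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def project_ball(w, center, radius):
    """Project w onto the Euclidean ball B(center, radius)."""
    diff = w - center
    norm = np.linalg.norm(diff)
    if norm <= radius:
        return w
    return center + radius * diff / norm

def assg_c_sketch(subgrad, w0, K, t, D1, eps0, G=1.0):
    """Stagewise stochastic subgradient method in the spirit of ASSG-c.

    subgrad(w) -- stochastic subgradient oracle (assumed)
    K, t       -- number of stages and iterations per stage
    D1, eps0   -- initial ball radius and initial error estimate
    G          -- bound on the stochastic subgradient norm (assumed)
    """
    w = np.asarray(w0, dtype=float)
    eps, D = eps0, D1
    for _ in range(K):
        eta = eps / (3.0 * G ** 2)      # stage-wise constant step size (constant factor assumed)
        center = w.copy()
        iterates = []
        for _ in range(t):
            w = project_ball(w - eta * subgrad(w), center, D)
            iterates.append(w.copy())
        w = np.mean(iterates, axis=0)   # warm-start the next stage from the averaged iterate
        eps /= 2.0                      # geometrically decrease the target error ...
        D /= 2.0                        # ... and the ball radius
    return w
```

With, e.g., a mini-batch hinge-loss subgradient oracle and D1 = 100 (the value quoted in the Experiment Setup row), this reproduces the warm-started, constant-step stages that the method uses in place of a single 1/τ-decay SSG run.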
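The restart schedule quoted in the Experiment Setup row (restart every 5 stages with t grown by a factor of 1.15 for hinge loss, or 2 for Huber loss and robust regression) could be wrapped around the sketch above roughly as follows. The number of restarts and the ϵ update between restarts are assumptions.

```python
def rassg_sketch(subgrad, w0, t0, D1, eps0, num_restarts=3,
                 stages_per_restart=5, growth=1.15):
    """Restarted ASSG (RASSG) sketch: rerun the stagewise method with t
    increased geometrically after each block of stages, mirroring the
    quoted experiment setup. Assumes assg_c_sketch from the previous
    sketch is in scope."""
    w, t, eps = np.asarray(w0, dtype=float), float(t0), eps0
    for _ in range(num_restarts):
        w = assg_c_sketch(subgrad, w, K=stages_per_restart, t=int(t),
                          D1=D1, eps0=eps)
        t *= growth                      # more iterations per stage after each restart
        eps /= 2.0 ** stages_per_restart  # carry forward the error estimate (assumption)
    return w
```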