Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling
Authors: Zeyuan Allen-Zhu, Zheng Qu, Peter Richtarik, Yang Yuan
ICML 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this paper, we improve the best known running time of accelerated coordinate descent by a factor up to n. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.1 |
| Researcher Affiliation | Academia | Zeyuan Allen-Zhu ZEYUAN@CSAIL.MIT.EDU Princeton University Zheng Qu ZHENGQU@HKU.HK University of Hong Kong Peter Richt arik PETER.RICHTARIK@ED.AC.UK University of Edinburgh Yang Yuan YANGYUAN@CS.CORNELL.EDU Cornell University |
| Pseudocode | Yes | Algorithm 1 NU ACDM(β, f, x0, T) Input: β [0, 1]; f a convex function that is coordinate-wise smooth with parameters (L1, . . . , Ln), and σβ-strongly convex with respect to Lβ for some β [0, 1]; x0 some initial point; and T the number of iterations. Output: y T such that E[f(y T )] f(x ) O((1 τ)T ) (f(x0) f(x )). |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We consider three datasets in this section: (1) class 1 of the news20 dataset (15, 935 samples and 62, 061 features), (2) the w8a dataset (49, 749 samples and 300 features), and (3) the covtype dataset (581, 012 samples and 54 features). All of them can be found on the Lib SVM website (Fan & Lin), and contain examples that have non-uniform Euclidean norms (see Figure 3 in the appendix for the distribution). |
| Dataset Splits | No | The paper mentions datasets used but does not specify the exact training, validation, or test split percentages or sample counts. It refers to 'experiments' and 'performance comparisons' but lacks the detailed split information. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to conduct the experiments (e.g., GPU/CPU models, memory specifications). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers that would be required to reproduce the experiments. |
| Experiment Setup | No | The paper describes the setup for experiments on ERM and solving linear systems, mentioning specific problem formulations (e.g., ridge regression, Lasso) and dataset properties, but it does not provide concrete hyperparameter values or system-level training settings in the main text. |