Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Even Faster Accelerated Coordinate Descent Using Non-Uniform Sampling

Authors: Zeyuan Allen-Zhu, Zheng Qu, Peter Richtarik, Yang Yuan

ICML 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this paper, we improve the best known running time of accelerated coordinate descent by a factor up to n. Our improvement is based on a clean, novel non-uniform sampling that selects each coordinate with a probability proportional to the square root of its smoothness parameter. Our proof technique also deviates from the classical estimation sequence technique used in prior work. Our speed-up applies to important problems such as empirical risk minimization and solving linear systems, both in theory and in practice.1
Researcher Affiliation	Academia	Zeyuan Allen-Zhu EMAIL Princeton University Zheng Qu EMAIL University of Hong Kong Peter Richt arik EMAIL University of Edinburgh Yang Yuan EMAIL Cornell University
Pseudocode	Yes	Algorithm 1 NU ACDM(β, f, x0, T) Input: β [0, 1]; f a convex function that is coordinate-wise smooth with parameters (L1, . . . , Ln), and σβ-strongly convex with respect to Lβ for some β [0, 1]; x0 some initial point; and T the number of iterations. Output: y T such that E[f(y T )] f(x ) O((1 τ)T ) (f(x0) f(x )).
Open Source Code	No	The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	Yes	We consider three datasets in this section: (1) class 1 of the news20 dataset (15, 935 samples and 62, 061 features), (2) the w8a dataset (49, 749 samples and 300 features), and (3) the covtype dataset (581, 012 samples and 54 features). All of them can be found on the Lib SVM website (Fan & Lin), and contain examples that have non-uniform Euclidean norms (see Figure 3 in the appendix for the distribution).
Dataset Splits	No	The paper mentions datasets used but does not specify the exact training, validation, or test split percentages or sample counts. It refers to 'experiments' and 'performance comparisons' but lacks the detailed split information.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to conduct the experiments (e.g., GPU/CPU models, memory specifications).
Software Dependencies	No	The paper does not specify any software dependencies with version numbers that would be required to reproduce the experiments.
Experiment Setup	No	The paper describes the setup for experiments on ERM and solving linear systems, mentioning specific problem formulations (e.g., ridge regression, Lasso) and dataset properties, but it does not provide concrete hyperparameter values or system-level training settings in the main text.