Approximate Steepest Coordinate Descent

Authors: Sebastian U. Stich, Anant Raj, Martin Jaggi

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Numerical experiments with Lasso and Ridge regression show promising improvements, in line with our theoretical guarantees." "In this section we evaluate the empirical performance of ASCD on synthetic and real datasets."
Researcher Affiliation | Academia | "EPFL; Max Planck Institute for Intelligent Systems. Correspondence to: Sebastian U. Stich <sebastian.stich@epfl.ch>."
Pseudocode | Yes | "Algorithm 1 Approximate SCD (ASCD)"; "Algorithm 2 Adaptation of ASCD for GS-q rule" (a hedged sketch of the selection idea follows the table)
Open Source Code | No | The paper provides no concrete access information (e.g., a repository link or an explicit statement of code release) for the methodology described.
Open Datasets | Yes | "For real datasets, we perform the experimental evaluation on RCV1 (binary, training), which consists of 20,242 samples, each of dimension 47,236 (Lewis et al., 2004). For the synthetic data, we follow the same generation procedure as described in (Nutini et al., 2015), which generates very sparse data matrices. For completeness, full details of the data generation process are also provided in the appendix in Sec. E." (a loading sketch follows the table)
Dataset Splits | No | The paper states which datasets are used and their sizes but gives no explicit train/validation/test splits, percentages, or counts.
Hardware Specification | No | The paper does not report the hardware (e.g., CPU/GPU models, processor types, or memory) used to run its experiments.
Software Dependencies | No | The paper does not name the libraries or solvers (with version numbers) needed to replicate its experiments.
Experiment Setup | Yes | "We use exact line search for the experiment in Fig. 3c; for all others we used a fixed step size rule (the convergence is slower for all algorithms, but the different effects of the selection of the active coordinate are more distinctly visible). ASCD is either initialized with the true gradient (Figs. 2a, 2b, 2d, 3c, 3d) or arbitrarily (with error bounds δ = ) in Figs. 3a and 3b (Fig. 2c compares both initializations). For the l1-regularized problems, we used ASCD with the GS-s rule." (a sketch of a GS-s step follows the table)
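
The paper's Algorithms 1 and 2 are not reproduced here. As a rough illustration only, the sketch below applies the core ASCD idea (pick the coordinate with the largest safe lower bound |g_i| - delta_i on the gradient magnitude, maintaining the estimate g and the error bounds delta cheaply between full refreshes) to ridge regression. The function name, the Cauchy-Schwarz bound inflation, and the periodic exact refresh are assumptions made for this sketch, not the paper's exact bound-maintenance scheme.

    import numpy as np

    def ascd_ridge(A, b, lam, iters=500, refresh=50):
        """Sketch of the ASCD idea on f(x) = 0.5*||Ax - b||^2 + 0.5*lam*||x||^2."""
        _, d = A.shape
        x = np.zeros(d)
        r = -b.astype(float)              # residual r = A x - b (with x = 0)
        g = A.T @ r                       # exact gradient at the start (lam*x = 0)
        delta = np.zeros(d)               # error bounds: |g_i - grad_i f(x)| <= delta_i
        col = np.linalg.norm(A, axis=0)   # column norms ||A_i||
        L = col ** 2 + lam                # coordinate-wise Lipschitz constants
        for t in range(iters):
            i = int(np.argmax(np.abs(g) - delta))  # approximate steepest coordinate
            gi = A[:, i] @ r + lam * x[i]          # exact partial derivative
            step = gi / L[i]                       # exact line search along e_i
            x[i] -= step
            r -= step * A[:, i]
            # only coordinate i is corrected exactly; for j != i the gradient
            # moves by step * A_j^T A_i, bounded here via Cauchy-Schwarz:
            delta += abs(step) * col * col[i]
            g[i], delta[i] = 0.0, 0.0              # grad_i f(x) = 0 after the exact step
            if (t + 1) % refresh == 0:             # periodic exact refresh
                g = A.T @ r + lam * x
                delta[:] = 0.0
        return x

With a refresh every iteration this reduces to the standard (exact) Gauss-Southwell rule; the interesting regime is when the cheap bound updates keep most iterations free of full gradient computations.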
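
The RCV1 (binary, training) instance with 20,242 samples and 47,236 features matches the rcv1.binary training file from the LIBSVM dataset collection. Assuming that file has been downloaded locally (the path below is hypothetical), the reported dimensions can be checked with scikit-learn:

    from sklearn.datasets import load_svmlight_file

    # "rcv1_train.binary" is a local copy of the LIBSVM file (hypothetical path)
    X, y = load_svmlight_file("rcv1_train.binary")
    print(X.shape)  # expected: (20242, 47236), as reported in the paper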
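
For the l1-regularized (Lasso) experiments the paper uses ASCD with the GS-s rule. One common formulation of GS-s (following Nutini et al., 2015) scores each coordinate by its minimum-norm subgradient; the sketch below shows a single such step with the fixed step size 1/L_i mentioned in the setup. It computes the full gradient of the smooth part for clarity, which is exactly the cost ASCD is designed to avoid; the names and structure are assumptions, not the paper's code.

    import numpy as np

    def soft_threshold(z, tau):
        """Prox of tau*|.|, the l1 coordinate update."""
        return np.sign(z) * max(abs(z) - tau, 0.0)

    def gs_s_step(A, r, x, lam, L):
        """One GS-s coordinate step for 0.5*||Ax - b||^2 + lam*||x||_1.
        r must hold the current residual A x - b; L[i] = ||A_i||^2."""
        grad = A.T @ r                    # gradient of the smooth part
        # minimum-norm subgradient per coordinate (the GS-s score)
        score = np.where(x != 0.0,
                         np.abs(grad + lam * np.sign(x)),
                         np.maximum(np.abs(grad) - lam, 0.0))
        i = int(np.argmax(score))
        xi_new = soft_threshold(x[i] - grad[i] / L[i], lam / L[i])
        r += (xi_new - x[i]) * A[:, i]    # keep the residual in sync
        x[i] = xi_new
        return i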