Approximate Steepest Coordinate Descent
Authors: Sebastian U. Stich, Anant Raj, Martin Jaggi
ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments with Lasso and Ridge regression show promising improvements, in line with our theoretical guarantees. In this section we evaluate the empirical performance of ASCD on synthetic and real datasets. |
| Researcher Affiliation | Academia | 1EPFL 2Max Planck Institute for Intelligent Systems. Correspondence to: Sebastian U. Stich <sebastian.stich@epfl.ch>. |
| Pseudocode | Yes | Algorithm 1 Approximate SCD (ASCD) Algorithm 2 Adaptation of ASCD for GS-q rule |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific repository link or an explicit statement of code release) for the methodology described. |
| Open Datasets | Yes | For real datasets, we perform the experimental evaluation on RCV1 (binary, training), which consists of 20,242 samples, each of dimension 47,236 (Lewis et al., 2004). For the synthetic data, we follow the same generation procedure as described in (Nutini et al., 2015), which generates very sparse data matrices. For completeness, full details of the data generation process are also provided in the appendix in Sec. E. |
| Dataset Splits | No | The paper mentions the datasets used and their sizes but does not provide explicit details about train/validation/test splits, percentages, or specific counts for reproducibility. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiments. |
| Experiment Setup | Yes | We use exact line search for the experiment in Fig. 3c; for all others we used a fixed step size rule (the convergence is slower for all algorithms, but the different effects of the selection of the active coordinate are more distinctly visible). ASCD is either initialized with the true gradient (Figs. 2a, 2b, 2d, 3c, 3d) or arbitrarily (with error bounds δ = ) in Figs. 3a and 3b (Fig. 2c compares both initializations). For the l1-regularized problems, we used ASCD with the GS-s rule. |
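
To make the "Pseudocode" and "Experiment Setup" rows above more concrete, the following is a minimal, hedged sketch of the idea behind approximate steepest coordinate descent on a ridge-regression objective. It is not the paper's exact Algorithm 1 (which maintains safe lower/upper bounds on gradient entries); the function name, the `refresh` parameter, and the synthetic data in the usage example are purely illustrative assumptions.

```python
# Sketch (assumption, not the authors' Algorithm 1): approximate steepest
# coordinate descent for ridge regression f(x) = 0.5*||Ax - b||^2 + 0.5*lam*||x||^2.
# A running residual r = Ax - b gives exact single gradient coordinates cheaply,
# while the full gradient snapshot used for coordinate *selection* is only
# resynced every `refresh` iterations, mimicking an approximate GS rule.
import numpy as np

def ascd_ridge_sketch(A, b, lam=1.0, iters=500, refresh=20):
    n, d = A.shape
    x = np.zeros(d)
    r = A @ x - b                         # residual, kept exact
    g_approx = A.T @ r + lam * x          # gradient snapshot for selection
    col_sq = (A ** 2).sum(axis=0)         # per-coordinate Lipschitz constants
    for t in range(iters):
        if t % refresh == 0:              # periodically resync the snapshot
            g_approx = A.T @ r + lam * x
        i = int(np.argmax(np.abs(g_approx)))    # approximate steepest coordinate
        g_i = A[:, i] @ r + lam * x[i]          # exact gradient of coordinate i
        step = g_i / (col_sq[i] + lam)          # exact coordinate minimizer
        x[i] -= step
        r -= step * A[:, i]                     # cheap residual update
        g_approx[i] = A[:, i] @ r + lam * x[i]  # refresh only the touched entry
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((200, 50))
    b = rng.standard_normal(200)
    x = ascd_ridge_sketch(A, b)
    obj = 0.5 * np.linalg.norm(A @ x - b) ** 2 + 0.5 * np.linalg.norm(x) ** 2
    print("final objective:", obj)
```

The sketch only illustrates the general pattern (select a coordinate from an inexact gradient, then update that coordinate exactly); the paper's ASCD instead tracks explicit error bounds on the gradient approximation and, for the l1-regularized experiments, uses the GS-s selection rule.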