Breaking the Nonsmooth Barrier: A Scalable Parallel Method for Composite Optimization
Authors: Fabian Pedregosa, Rémi Leblond, Simon Lacoste-Julien
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We prove that our method achieves a theoretical linear speedup with respect to the sequential version under assumptions on the sparsity of gradients and block-separability of the proximal term. Empirical benchmarks on a multi-core architecture illustrate practical speedups of up to 12x on a 20-core machine. |
| Researcher Affiliation | Academia | Fabian Pedregosa, INRIA/ENS, Paris, France; Rémi Leblond, INRIA/ENS, Paris, France; Simon Lacoste-Julien, MILA and DIRO, Université de Montréal, Canada; DI, École normale supérieure, CNRS, PSL Research University |
| Pseudocode | Yes | Algorithm 1 PROXASAGA (analyzed) ... Algorithm 2 PROXASAGA (implemented) (a minimal sequential sketch follows the table) |
| Open Source Code | Yes | A reference C++/Python implementation of the algorithm is available at https://github.com/fabianp/ProxASAGA |
| Open Datasets | Yes | Table 1: Description of datasets (n, p, density, L, Δ). KDD 2010 (Yu et al., 2010): n = 19,264,097, p = 1,163,024, density = 10⁻⁶, L = 28.12, Δ = 0.15. KDD 2012 (Juan et al., 2016): n = 149,639,105, p = 54,686,452, density = 2×10⁻⁷, L = 1.25, Δ = 0.85. Criteo (Juan et al., 2016): n = 45,840,617, p = 1,000,000, density = 4×10⁻⁵, L = 1.25, Δ = 0.89 |
| Dataset Splits | No | The paper uses large-scale datasets and evaluates convergence and speedup, but does not explicitly describe specific train/validation/test dataset splits, ratios, or cross-validation methodology. |
| Hardware Specification | No | The paper mentions 'on a multi-core architecture' and 'on a 20-core machine' but does not provide specific details such as CPU/GPU models, memory, or other hardware specifications used for the experiments. |
| Software Dependencies | No | The paper mentions a 'C++/Python implementation' but does not provide specific version numbers for any software dependencies, libraries, or compilers used in the experiments. |
| Experiment Setup | Yes | The objective function takes the form $\frac{1}{n}\sum_{i=1}^{n} \log\bigl(1 + \exp(-b_i a_i^\top x)\bigr) + \frac{\lambda_1}{2}\lVert x\rVert_2^2 + \lambda_2 \lVert x\rVert_1$, where $a_i \in \mathbb{R}^p$ and $b_i \in \{-1, +1\}$ are the data samples. Following Defazio et al. (2014), we set $\lambda_1 = 1/n$. The amount of $\ell_1$ regularization ($\lambda_2$) is selected to give approximately 1/10 nonzero coefficients. ... We use the following step sizes: $1/2L$ for PROXASAGA and $1/L_c$ for ASYSPCD, where $L_c$ is the coordinate-wise Lipschitz constant of the gradient, while FISTA uses backtracking line-search. (A setup sketch follows the table.) |
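
The "Pseudocode" row refers to the paper's Algorithm 1 (analyzed) and Algorithm 2 (implemented) variants of PROXASAGA. For orientation, below is a minimal NumPy sketch of the sequential proximal SAGA iteration that PROXASAGA parallelizes asynchronously. The names `prox_saga` and `prox_l1` are illustrative, this is not the authors' implementation, and it omits the sparse, block-separable bookkeeping that makes the asynchronous version efficient.

```python
import numpy as np

def prox_l1(x, thresh):
    """Soft-thresholding: proximal operator of thresh * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - thresh, 0.0)

def prox_saga(grad_i, prox, x0, n, step, n_epochs=10, seed=0):
    """Sequential proximal SAGA sketch (the serial baseline that
    PROXASAGA parallelizes). grad_i(x, i) returns the gradient of
    the i-th smooth term at x; prox(v, s) applies the proximal
    operator of the nonsmooth term scaled by the step size s."""
    rng = np.random.default_rng(seed)
    x = x0.copy()
    memory = np.array([grad_i(x0, i) for i in range(n)])  # gradient table
    avg = memory.mean(axis=0)                             # its running mean
    for _ in range(n_epochs * n):
        i = rng.integers(n)
        g = grad_i(x, i)
        v = g - memory[i] + avg        # unbiased variance-reduced estimate
        x = prox(x - step * v, step)   # proximal step on the nonsmooth term
        avg += (g - memory[i]) / n     # keep the running mean consistent
        memory[i] = g
    return x

# Usage sketch, assuming grad_of_f_i and lam2 are defined:
# x = prox_saga(lambda x, i: grad_of_f_i(x, i),
#               lambda v, s: prox_l1(v, s * lam2),
#               np.zeros(p), n, step)
```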
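The "Experiment Setup" row can likewise be made concrete. The sketch below assembles the elastic-net logistic regression objective and the $1/2L$ step size on toy data. The per-sample smoothness bound $L = \max_i \lVert a_i\rVert^2/4 + \lambda_1$ is a standard assumption for the averaged logistic loss, not a formula quoted from the paper, and `lam2 = 1e-3` is an arbitrary stand-in for the paper's tuned value.

```python
import numpy as np

def objective(x, A, b, lam1, lam2):
    """Elastic-net logistic regression from the quoted setup:
    (1/n) sum_i log(1 + exp(-b_i a_i^T x))
    + lam1/2 ||x||_2^2 + lam2 ||x||_1."""
    z = b * (A @ x)
    return (np.logaddexp(0.0, -z).mean()   # stable log(1 + exp(-z))
            + 0.5 * lam1 * (x @ x)
            + lam2 * np.abs(x).sum())

# Toy stand-in data (the paper uses KDD 2010, KDD 2012, and Criteo).
rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 50))
b = np.sign(rng.standard_normal(1000))
n = A.shape[0]

lam1 = 1.0 / n  # as in the quoted setup, following Defazio et al. (2014)
lam2 = 1e-3     # the paper tunes this for ~1/10 nonzero coefficients
# Assumed smoothness constant: largest per-sample bound ||a_i||^2 / 4
# for the logistic term, plus lam1 for the l2 term.
L = np.max((A * A).sum(axis=1)) / 4 + lam1
step = 1.0 / (2 * L)  # PROXASAGA step size 1/2L from the quoted setup
print(objective(np.zeros(A.shape[1]), A, b, lam1, lam2))
```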