Beyond L1: Faster and Better Sparse Models with skglm
Authors: Quentin Bertrand, Quentin Klopfenstein, Pierre-Antoine Bannier, Gauthier Gidel, Mathurin Massias
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide an extensive experimental comparison and we show state-of-the-art improvements on a wide range of convex and non-convex problems. |
| Researcher Affiliation | Collaboration | Quentin Bertrand (Mila & UdeM, Canada, quentin.bertrand@mila.quebec); Quentin Klopfenstein (Luxembourg Centre for Systems Biomedicine, University of Luxembourg, Esch-sur-Alzette, Luxembourg); Pierre-Antoine Bannier (Independent Researcher); Gauthier Gidel (Mila & UdeM, Canada; CIFAR AI Chair); Mathurin Massias (Univ. Lyon, Inria, CNRS, ENS de Lyon, UCB Lyon 1, LIP UMR 5668, F-69342 Lyon, France) |
| Pseudocode | Yes | Algorithm 1: skglm (proposed). input: X, β ∈ ℝ^p, n_out ∈ ℕ, n_in ∈ ℕ, ws_size ∈ ℕ, ϵ > 0 |
| Open Source Code | Yes | We release skglm, a flexible, scikit-learn compatible package, which easily handles customized datafits and penalties. |
| Open Datasets | Yes | We use datasets from LIBSVM (Fan et al., 2008; see Table 2). |
| Dataset Splits | No | The checklist states 'Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] See Section 3.' However, Section 3 describes the benchmarking process and dataset usage without explicitly detailing train/validation/test splits, split percentages, or any cross-validation methodology for the models themselves. |
| Hardware Specification | No | Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A] |
| Software Dependencies | No | The paper states 'Our package relying on numpy and numba (Lam et al., 2015; Harris et al., 2020) is attached in the supplementary material,' but no version numbers are given for these dependencies. |
| Experiment Setup | Yes | skglm (Algorithm 1, ours), using M = 5 iterates for the Anderson extrapolation. |
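The 'Experiment Setup' row mentions Anderson extrapolation over the last M = 5 iterates. As an illustration of that technique (a minimal NumPy sketch of generic Anderson extrapolation, not the paper's implementation; the function name `anderson_extrapolate` and the `reg` constant are our own), the idea is to combine the last few fixed-point iterates with weights that sum to one and minimize the norm of the combined residual:

```python
import numpy as np

def anderson_extrapolate(iterates, reg=1e-10):
    """Extrapolate a sequence of fixed-point iterates (hypothetical sketch).

    iterates: list of M + 1 successive iterates x_0, ..., x_M in R^p.
    Returns sum_i c_i x_i over the last M iterates, where the weights c
    minimize ||sum_i c_i (x_i - x_{i-1})|| subject to sum_i c_i = 1.
    """
    X = np.asarray(iterates, dtype=float)   # shape (M + 1, p)
    U = np.diff(X, axis=0)                  # successive differences, shape (M, p)
    M = U.shape[0]
    # Solve the constrained least squares via (U U^T) z = 1, then normalize
    # so the coefficients sum to one; `reg` guards against singular systems.
    z = np.linalg.solve(U @ U.T + reg * np.eye(M), np.ones(M))
    c = z / z.sum()
    return c @ X[1:]

# Usage: accelerate x_{k+1} = T x_k + b, whose fixed point is (I - T)^{-1} b.
T, b = 0.5 * np.eye(2), np.array([1.0, 2.0])
xs = [np.zeros(2)]
for _ in range(3):
    xs.append(T @ xs[-1] + b)
x_acc = anderson_extrapolate(xs)            # close to the fixed point [2, 4]
```

For this linear toy iteration the extrapolated point lands essentially on the fixed point, far beyond the last plain iterate; in the paper's setting the same mechanism is applied to coordinate-descent iterates restricted to a working set.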