Safe Grid Search with Optimal Complexity

Authors: Eugene Ndiaye, Tam Le, Olivier Fercoq, Joseph Salmon, Ichiro Takeuchi

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We illustrate our method on $\ell_1$-regularized least squares and logistic regression by comparing the computational times and number of grid points needed to compute an $\epsilon$-path for a given range $[\lambda_{\min}, \lambda_{\max}]$ for several strategies. ... Results are reported in Figure 3 for classification and regression problem. Our approach leads to better guarantees for approximating the regularization path w.r.t. the default grid and often significant gain in computing time."
Researcher Affiliation | Academia | (1) Riken AIP; (2) LTCI, Télécom ParisTech, Université Paris-Saclay; (3) IMAG, Univ Montpellier, CNRS, Montpellier, France; (4) Nagoya Institute of Technology.
Pseudocode | Yes | "Algorithm 1 training path" and "Algorithm 2 $\epsilon_v$-path for Validation Set"
Open Source Code | Yes | "Our implementation is available at https://github.com/EugeneNdiaye/safe_grid_search."
Open Datasets | Yes | "Our experiments were conducted on the leukemia dataset, available in sklearn and the climate dataset NCEP/NCAR Reanalysis (Kalnay et al., 1996)."
Dataset Splits | Yes | "It consists in splitting the data in two parts: on the first part (training set) the method is trained for a predefined collection of candidates $\Lambda_T := \{\lambda_0, \ldots, \lambda_{T-1}\}$, and on the second part (validation set), the best parameter is selected among the $T$ candidates." and "on the validation set (30% of the observations)"
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions software packages like 'sklearn' and 'glmnet' in the context of datasets and default grids, but it does not specify version numbers for any software dependencies.
Experiment Setup | Yes | "The Default grid is the one used by default in the packages glmnet (Friedman et al., 2010) and sklearn (Pedregosa et al., 2011). It is defined as $\lambda_t = \lambda_{\max} 10^{-\delta t/(T-1)}$ (here $\delta = 3$). ... We have used the same (vanilla) coordinate descent optimization solver with warm start between parameters for all grids. ... $\epsilon = 10^{-4} \lVert y \rVert^2$ for the least-squares case and $\epsilon = 10^{-4} \min(n_1, n_2)/n$ where $n_i$ is the number of observations in the class $i \in \{0, 1\}$, for the logistic case."
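
The default grid and the two tolerance settings quoted in the Experiment Setup entry can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `default_grid`, `eps_least_squares`, and `eps_logistic` are hypothetical, λmax is passed in rather than computed from data, and the norm on y is assumed to be the squared Euclidean norm.

```python
import numpy as np


def default_grid(lambda_max, T, delta=3.0):
    """Geometric grid lambda_t = lambda_max * 10**(-delta * t / (T - 1)), t = 0..T-1."""
    if T < 2:
        raise ValueError("T must be at least 2")
    t = np.arange(T)
    return lambda_max * 10.0 ** (-delta * t / (T - 1))


def eps_least_squares(y, scale=1e-4):
    """Tolerance eps = scale * ||y||^2 (squared Euclidean norm is an assumption)."""
    y = np.asarray(y, dtype=float)
    return scale * float(y @ y)


def eps_logistic(labels, scale=1e-4):
    """Tolerance eps = scale * (size of the smaller class) / n for binary labels."""
    labels = np.asarray(labels)
    _, counts = np.unique(labels, return_counts=True)
    if counts.size != 2:
        raise ValueError("expected exactly two classes")
    return scale * counts.min() / labels.size


# Illustrative values only: lambda_max = 1 and T = 4 are not taken from the paper.
grid = default_grid(lambda_max=1.0, T=4)
```

With these illustrative inputs the grid runs from λmax down to λmax · 10^(−3), so δ = 3 fixes the ratio between the largest and smallest grid values at three orders of magnitude regardless of T.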