A New Branch-and-Bound Pruning Framework for $\ell_0$-Regularized Problems

Authors: Theo Guyard, Cédric Herzet, Clément Elvira, Ayse-Nur Arslan

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show through numerical simulations that our pruning strategy can improve the solving time of BnB procedures by several orders of magnitude for typical problems encountered in machine-learning applications. [...] 4. Numerical Experiments: In this final section, we assess numerically the proposed pruning strategy to accelerate BnB algorithms addressing problem (P). [...] We generate synthetic problem instances as described above by varying one parameter at a time to cover different working regimes. In Figure 4, we represent the acceleration factor in terms of solving time obtained by our novel pruning strategy.
Researcher Affiliation | Academia | 1 Inria and Insa, Univ Rennes, CNRS, IRMAR UMR 6625, Rennes, France; 2 Ensai, Univ Rennes, CNRS, CREST UMR 9194, Rennes, France; 3 CentraleSupélec, Univ Rennes, CNRS, IETR UMR 6164, Rennes, France; 4 Inria, Univ Bordeaux, CNRS, IMB UMR 5251, Bordeaux, France.
Pseudocode | Yes | Algorithm 1: Tree expansion based on simultaneous pruning
Open Source Code | Yes | Reproducibility: The research presented in this paper is reproducible. The associated code is open-sourced10 and all the datasets used in our simulations are publicly available. Footnote 10: https://github.com/TheoGuyard/El0ps
Open Datasets | Yes | The associated code is open-sourced10 and all the datasets used in our simulations are publicly available. [...] We use the instances of y and A provided by the RIBOFLAVIN (Bühlmann et al., 2014) and BCTCGA (Liu et al., 2018) datasets [...] We use instances of y and A from the COLON CANCER (Alon et al., 1999) and LEUKEMIA (Golub et al., 1999) datasets [...] We use instances of y and A from the BREAST CANCER and ARCENE datasets (Chang & Lin, 2011).
Dataset Splits | No | The paper discusses instance generation and hyperparameter calibration but does not explicitly provide details about training, validation, or test dataset splits (e.g., percentages or sample counts).
Hardware Specification | Yes | Computations were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by INRIA and including CNRS, RENATER and several universities as well as other organizations.11 Experiments were run on a Debian 10 operating system, featuring one Intel Xeon E5-2660 v3 CPU clocked at 2.60 GHz with 16 GB of RAM.
Software Dependencies | No | The paper mentions Mosek, Cplex, and Gurobi, which are off-the-shelf Mixed-Integer Programming (MIP) solvers, as well as the l0learn package, but it does not provide specific version numbers for any of these software components. It also mentions a Debian 10 operating system, but this is an OS rather than an ancillary software dependency with a version.
Experiment Setup | Yes | Instance generation: For each problem instance, we generate the rows of $A \in \mathbb{R}^{m \times n}$ as independent realizations of a multivariate normal distribution with zero mean and covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$ where each entry $(i, j)$ is defined as $\Sigma_{ij} = \rho^{|i-j|}$ for some $\rho \in [0, 1)$. Moreover, we set $y = Ax + e$ where $x \in \mathbb{R}^{n}$ has $k$ evenly-spaced non-zero entries of unit amplitude in absolute value and where $e \in \mathbb{R}^{m}$ is a zero-mean Gaussian noise with a variance tuned to obtain some signal-to-noise ratio $\tau = 10 \log_{10}(\|Ax\|_2^2 / \|e\|_2^2)$. We generate each problem instance as described above with the parameters $k = 5$, $m = 500$, $n = 1000$, $\rho = 0.9$ and $\tau = 10$. [...] We use an approach similar to that considered in (Hazimeh et al., 2022) to solve numerically problem $(\mathcal{R}^{\nu})$. More precisely, we solve a sequence of sub-problems defined as $\inf \, f(Ax) + (g^{\nu})^{**}(x) \ \text{s.t.} \ x_i = 0 \ \forall i \notin W$ $(\mathcal{R}^{\nu}_{W})$, where $W \subseteq \llbracket 1, n \rrbracket$ is some working set. Each sub-problem $(\mathcal{R}^{\nu}_{W})$ is solved using a classical coordinate descent procedure. [...] To calibrate the value of $\lambda$ in (P) and the hyperparameters of the function $h(\cdot)$, we use the l0learn package (Dedieu et al., 2021), which can approximately solve some specific instances of the problem. More specifically, we call the cv.fit procedure to perform a grid search over the values of $\lambda$ and the hyperparameters of the function $h(\cdot)$.
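To make the instance-generation protocol quoted above concrete, here is a minimal NumPy sketch of the described steps. It is not the authors' implementation (the open-sourced El0ps repository linked above is the reference); the function name generate_instance, the random sign pattern of the non-zero entries, and the choice of rescaling the realized noise vector (rather than tuning its variance) to hit the target SNR exactly are assumptions made for illustration.

```python
import numpy as np

def generate_instance(k=5, m=500, n=1000, rho=0.9, tau=10.0, seed=0):
    """Sketch of the synthetic instance generation described in the paper.

    Rows of A are i.i.d. N(0, Sigma) with Sigma_ij = rho**|i - j|, x has k
    evenly-spaced non-zero entries of unit amplitude, and the noise e is
    scaled so that tau = 10 * log10(||Ax||_2^2 / ||e||_2^2) holds.
    """
    rng = np.random.default_rng(seed)

    # Covariance matrix with entries Sigma_ij = rho^|i - j|
    idx = np.arange(n)
    Sigma = rho ** np.abs(idx[:, None] - idx[None, :])

    # Rows of A drawn independently from N(0, Sigma)
    A = rng.multivariate_normal(np.zeros(n), Sigma, size=m)

    # Sparse ground truth with k evenly-spaced non-zeros of unit amplitude
    # (random signs are an assumption; the paper only fixes |x_i| = 1)
    x = np.zeros(n)
    support = np.linspace(0, n - 1, num=k, dtype=int)
    x[support] = rng.choice([-1.0, 1.0], size=k)

    # Zero-mean Gaussian noise rescaled to reach the target SNR exactly
    e = rng.standard_normal(m)
    e *= np.linalg.norm(A @ x) / (np.linalg.norm(e) * 10.0 ** (tau / 20.0))
    y = A @ x + e

    return A, y, x
```

With the default values quoted from the paper (k = 5, m = 500, n = 1000, ρ = 0.9, τ = 10), this produces a 500 × 1000 instance with strongly correlated columns; the reported experiments then vary one parameter at a time around this configuration.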