Dynamic Sasvi: Strong Safe Screening for Norm-Regularized Least Squares
Authors: Hiroaki Yamada, Makoto Yamada
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we show the efficacy of the proposed methods using real-world data. We compared the proposed methods with Gap Safe Sphere and Gap Safe Dome [7, 17], which are state-of-the-art dynamic safe screening methods. All methods were run on a MacBook Air with a 1.1 GHz quad-core Intel Core i5 CPU with 16 GB of RAM. All methods were implemented in C++ using the Accelerate framework. In this section, methods are evaluated on Lasso. Experiments on group Lasso are provided in the appendix. |
| Researcher Affiliation | Academia | Hiroaki Yamada Kyoto University hyamada@ml.ist.i.kyoto-u.ac.jp Makoto Yamada Kyoto University and RIKEN AIP myamada@i.kyoto-u.ac.jp |
| Pseudocode | Yes | Algorithm 1: Coordinate descent with Dynamic Sasvi for Lasso (a minimal sketch of this kind of screening loop is given after the table) |
| Open Source Code | Yes | Source code is available at https://github.com/k8127i/2021DynamicSasvi |
| Open Datasets | Yes | We solved the Lasso problem using the leukemia dataset [11] (dense data with 72 samples and 7128 features) and λ = ‖Xᵀy‖∞ / 100. We used cyclic coordinate descent as the iterative algorithm and screened the variables every 10 iterations. Figure 2a shows the ratio of uneliminated features at each iteration. As guaranteed theoretically, Dynamic Sasvi eliminates more variables in earlier steps than both Gap Safe Dome and Gap Safe Sphere. The figure also shows that Dynamic EDPP, a relaxed version of Dynamic Sasvi, eliminates almost the same number of features as Dynamic Sasvi. Next, we compared the computation time for the path of Lasso solutions over various values of λ. We used λⱼ = 100^(−j/99) ‖Xᵀy‖∞ (j = 0, …, 99). We used the estimated primal solution for λⱼ, with scaling, as the initial vector in the solver for the problem with λⱼ₊₁. The iterative solver stops when the duality gap is smaller than ϵ(P(0) − D(0)). Note that P(0) − D(0) makes the stopping criterion independent of the data scale. We used the leukemia and RCV1 [13] datasets (sparse data with 23149 samples and 47236 features). We subsampled the data 50 times and ran all the methods on the same 50 subsamples. The subsampled data size was 50 for leukemia and 20000 for RCV1. |
| Dataset Splits | No | The paper mentions subsampling the data and using a duality-gap stopping criterion, but does not describe any validation split used for model tuning or hyperparameter selection. |
| Hardware Specification | Yes | All methods were run on a MacBook Air with a 1.1 GHz quad-core Intel Core i5 CPU with 16 GB of RAM. |
| Software Dependencies | No | All methods were implemented in C++ using the Accelerate framework. The paper names the implementation language and framework but does not give version numbers for the C++ toolchain or the Accelerate framework. |
| Experiment Setup | Yes | We used cyclic coordinate descent as the iterative algorithm and screened the variables every 10 iterations. We used λⱼ = 100^(−j/99) ‖Xᵀy‖∞ (j = 0, …, 99). We used the estimated primal solution for λⱼ, with scaling, as the initial vector in the solver for the problem with λⱼ₊₁. The iterative solver stops when the duality gap is smaller than ϵ(P(0) − D(0)). (A sketch of this λ-path protocol follows the coordinate-descent sketch below.) |
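
To make the screening loop behind Algorithm 1 concrete, below is a minimal NumPy sketch of cyclic coordinate descent for the Lasso with a periodic dynamic safe-screening step. It uses the well-known Gap Safe sphere test [7, 17] as the screening rule, since the exact Dynamic Sasvi test is defined only in the paper; the function names, the `screen_every` parameter, and all defaults are illustrative assumptions, not the authors' C++ implementation.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator, the prox of t * |.|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd_screened(X, y, lam, n_iters=100, screen_every=10):
    """Cyclic coordinate descent for 1/2 ||y - Xw||^2 + lam * ||w||_1
    with a Gap Safe sphere screening step every `screen_every` iterations
    (an illustrative stand-in for the paper's Dynamic Sasvi test)."""
    n, d = X.shape
    w = np.zeros(d)
    r = y.copy()                      # residual y - Xw
    sq_norms = (X ** 2).sum(axis=0)   # ||x_j||^2 for each feature
    active = np.ones(d, dtype=bool)   # features not yet eliminated

    for it in range(n_iters):
        if it % screen_every == 0:
            # Dual-feasible point built from the residual (Gap Safe scaling).
            theta = r / max(lam, np.abs(X.T @ r).max())
            primal = 0.5 * r @ r + lam * np.abs(w).sum()
            dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
            radius = np.sqrt(2.0 * max(primal - dual, 0.0)) / lam
            # Sphere test: feature j is safely zero if |x_j' theta| + R ||x_j|| < 1.
            scores = np.abs(X.T @ theta) + radius * np.sqrt(sq_norms)
            newly_out = active & (scores < 1.0)
            r += X[:, newly_out] @ w[newly_out]  # remove their contribution
            w[newly_out] = 0.0
            active &= ~newly_out
        for j in np.where(active)[0]:
            if sq_norms[j] == 0.0:
                continue
            old = w[j]
            z = old + (X[:, j] @ r) / sq_norms[j]
            w[j] = soft_threshold(z, lam / sq_norms[j])
            if w[j] != old:
                r -= (w[j] - old) * X[:, j]
    return w
```

Setting a screened coefficient to zero and restoring its contribution to the residual keeps the iterate consistent; a safely screened feature is guaranteed to be zero at the optimum, so it can be dropped from all later coordinate passes.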
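The λ-path protocol quoted in the Experiment Setup row (the geometric grid λⱼ = 100^(−j/99)‖Xᵀy‖∞, warm starts, and the scale-free stopping rule ϵ(P(0) − D(0))) can be sketched as follows, again under assumptions: the `solver` callback is a hypothetical stand-in for one pass of any iterative Lasso solver, and the paper's exact warm-start scaling rule is not quoted above, so this sketch warm-starts at the unscaled previous solution.

```python
import numpy as np

def duality_gap(X, y, w, lam):
    """Lasso duality gap P(w) - D(theta) with the residual-based dual point."""
    r = y - X @ w
    theta = r / max(lam, np.abs(X.T @ r).max())
    primal = 0.5 * r @ r + lam * np.abs(w).sum()
    dual = 0.5 * y @ y - 0.5 * lam ** 2 * np.sum((theta - y / lam) ** 2)
    return primal - dual

def lasso_path(X, y, solver, eps=1e-6):
    """Solve the Lasso on the grid lam_j = 100**(-j/99) * ||X.T y||_inf.

    `solver(X, y, lam, w0)` is an assumed callback that runs some iterations
    of a Lasso solver from the warm start w0 and returns the new iterate.
    Each sub-problem stops when the gap drops below eps * (P(0) - D(0)).
    """
    lam_max = np.abs(X.T @ y).max()
    w = np.zeros(X.shape[1])
    path = []
    for j in range(100):
        lam = lam_max * 100.0 ** (-j / 99.0)
        # Warm start at the previous solution; the paper additionally
        # rescales it, but the exact scaling rule is not quoted above.
        tol = eps * duality_gap(X, y, np.zeros_like(w), lam)  # eps * (P(0) - D(0))
        while duality_gap(X, y, w, lam) > tol:
            w = solver(X, y, lam, w)
        path.append((lam, w.copy()))
    return path
```

At j = 0 we have λ₀ = ‖Xᵀy‖∞, for which w = 0 is already optimal and the gap at zero vanishes, so the inner loop first does real work at j = 1.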