Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
LassoNet: A Neural Network with Feature Sparsity
Authors: Ismael Lemhadri, Feng Ruan, Louis Abraham, Robert Tibshirani
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply LassoNet to a number of real-data problems and find that it significantly outperforms state-of-the-art methods for feature selection and regression. ... In this section, we show experimental results on real-world datasets. |
| Researcher Affiliation | Collaboration | Ismael Lemhadri, Department of Statistics, Stanford University, Stanford, U.S.A. ... Feng Ruan, Department of Statistics, University of California, Berkeley, U.S.A. ... Louis Abraham, Gematria Technologies, London, U.K. ... Robert Tibshirani, Departments of Biomedical Data Sciences and Statistics, Stanford University, Stanford, U.S.A. |
| Pseudocode | Yes | The procedure is summarized in Alg. 1. ... The key novelty is a numerically efficient algorithm for the proximal inner loop. We call the proposed algorithm Hier-Prox and detail it in Alg. 2. ... Algorithm 3 Training LassoNet for Unsupervised Feature Selection ... Algorithm 4 Group Hierarchical Proximal Operator ... Algorithm 5 LassoNet for Matrix Completion |
| Open Source Code | Yes | We have made the code for our algorithm and experiments available on a public website (https://lassonet.ml). ... Python code and documentation for LassoNet is available at https://lassonet.ml, and R code will soon be available on the same website. |
| Open Datasets | Yes | Mice Protein Dataset consists of protein expression levels measured in the cortex of normal and trisomic mice who had been exposed to different experimental conditions. ... (Higuera et al., 2015) ... MNIST and MNIST-Fashion consist of 28-by-28 grayscale images of hand-written digits and clothing items, respectively. ... ISOLET consists of preprocessed speech data ... COIL-20 consists of centered grayscale images of 20 objects. ... Smartphone Dataset for Human Activity Recognition consists of sensor data ... The remaining datasets were retrieved from the UCI Repository (Dua and Graff, 2017). |
| Dataset Splits | Yes | We divide each data set randomly into train, validation and test with a 70-10-20 split. |
| Hardware Specification | Yes | All experiments were run on a single computer with NVIDIA Tesla K80 and Intel Xeon E5-2640. |
| Software Dependencies | No | The implementation was conducted in the PyTorch framework. |
| Experiment Setup | Yes | For all of the experiments, we use the Adam optimizer with a learning rate of 10⁻³ to train the initial dense model. Then, we use vanilla gradient descent with momentum equal to 0.9 on the regularization path. ... We used a learning rate of 0.001 and an early stopping criterion of 10. Although the hierarchy parameter could in principle be selected on a validation set as well, we have found that the default value M = 10 works well for a variety of datasets. The number of neurons in the hidden layer was varied within [d/3, 2d/3, d, 4d/3]. |
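The 70-10-20 train/validation/test split reported in the table is straightforward to reproduce at the index level. The sketch below is illustrative only (the function name and seed are our own, not the authors' code):

```python
import numpy as np

def split_70_10_20(n, seed=0):
    """Return disjoint index arrays for a random 70/10/20 train/val/test split."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n)
    n_train, n_val = int(0.7 * n), int(0.1 * n)
    # Remaining ~20% of indices form the test set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

For a dataset of 1000 rows this yields 700/100/200 indices covering every example exactly once.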
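The two-stage optimization quoted in the setup row (Adam at learning rate 10⁻³ for the initial dense model, then vanilla gradient descent with momentum 0.9 along the regularization path) can be sketched on a toy least-squares objective. This illustrates only the optimizer schedule, not the LassoNet training loop itself; the toy problem and all names here are hypothetical:

```python
import numpy as np

# Toy least-squares objective: f(w) = 0.5 * ||X w - y||^2
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_true = np.array([1.0, -2.0, 0.0, 3.0, 0.5])
y = X @ w_true

def grad(w):
    return X.T @ (X @ w - y)

def train_two_stage(w0, adam_steps=1000, gd_steps=1000,
                    lr=1e-3, momentum=0.9,
                    beta1=0.9, beta2=0.999, eps=1e-8):
    """Stage 1: Adam warm start (lr 1e-3); stage 2: gradient descent with momentum 0.9."""
    w = w0.copy()
    m, v = np.zeros_like(w), np.zeros_like(w)
    for t in range(1, adam_steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (np.sqrt(v_hat) + eps)
    buf = np.zeros_like(w)  # momentum buffer for the second stage
    for _ in range(gd_steps):
        buf = momentum * buf + grad(w)
        w -= lr * buf
    return w
```

In the paper this second stage is additionally interleaved with the Hier-Prox proximal step to enforce the feature-sparsity hierarchy; that step is omitted here.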