Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

LassoNet: A Neural Network with Feature Sparsity

Authors: Ismael Lemhadri, Feng Ruan, Louis Abraham, Robert Tibshirani

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We apply LassoNet to a number of real-data problems and find that it significantly outperforms state-of-the-art methods for feature selection and regression. ... In this section, we show experimental results on real-world datasets.
Researcher Affiliation Collaboration Ismael Lemhadri EMAIL Department of Statistics, Stanford University, Stanford, U.S.A. ... Feng Ruan EMAIL Department of Statistics, University of California, Berkeley, USA ... Louis Abraham EMAIL Gematria Technologies, London, U.K. ... Robert Tibshirani EMAIL Departments of Biomedical Data Sciences, and Statistics, Stanford University, Stanford, U.S.A.
Pseudocode Yes The procedure is summarized in Alg. 1. ... The key novelty is a numerically efficient algorithm for the proximal inner loop. We call the proposed algorithm Hier-Prox and detail it in Alg. 2. ... Algorithm 3 Training LassoNet for Unsupervised Feature Selection ... Algorithm 4 Group Hierarchical Proximal Operator ... Algorithm 5 LassoNet for Matrix Completion
Open Source Code Yes We have made the code for our algorithm and experiments available on a public website: https://lassonet.ml ... Python code and documentation for LassoNet is available at https://lassonet.ml, and R code will soon be available on the same website.
Open Datasets Yes Mice Protein Dataset consists of protein expression levels measured in the cortex of normal and trisomic mice who had been exposed to different experimental conditions. ... (Higuera et al., 2015) ... MNIST and MNIST-Fashion consist of 28-by-28 grayscale images of hand-written digits and clothing items, respectively. ... ISOLET consists of preprocessed speech data ... COIL-20 consists of centered grayscale images of 20 objects. ... Smartphone Dataset for Human Activity Recognition consists of sensor data ... The remaining datasets were retrieved from the UCI Repository (Dua and Graff, 2017).
Dataset Splits Yes We divide each data set randomly into train, validation and test with a 70-10-20 split.
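The 70-10-20 random split quoted above can be reproduced with a short helper. This is a minimal sketch, not the authors' code; the function name and seed handling are assumptions.

```python
import random

def split_indices(n, seed=0):
    """Randomly partition n sample indices into train/validation/test
    sets with a 70-10-20 split, as described in the paper."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    n_train = int(0.7 * n)
    n_val = int(0.1 * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

For example, `split_indices(1000)` yields disjoint index lists of sizes 700, 100, and 200 covering all samples.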
Hardware Specification Yes All experiments were run on a single computer with NVIDIA Tesla K80 and Intel Xeon E5-2640.
Software Dependencies No The implementation was conducted in the PyTorch framework.
Experiment Setup Yes For all of the experiments, we use the Adam optimizer with a learning rate of 10^-3 to train the initial dense model. Then, we use vanilla gradient descent with momentum equal to 0.9 on the regularization path. ... We used a learning rate of 0.001 and early stopping criterion of 10. Although the hierarchy parameter could in principle be selected on a validation set as well, we have found that the default value M = 10 works well for a variety of datasets. The number of neurons in the hidden layer was varied within [d/3, 2d/3, d, 4d/3].
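The hyperparameters quoted above can be collected into one place for reference. This is a hedged sketch of the reported configuration only; the constant and helper names are hypothetical and do not come from the LassoNet codebase.

```python
# Phase 1: the initial dense model is trained with Adam at lr 1e-3.
DENSE_PHASE = {"optimizer": "Adam", "lr": 1e-3}

# Phase 2: the regularization path uses vanilla gradient descent
# with momentum 0.9 (learning rate 0.001 per the quote).
PATH_PHASE = {"optimizer": "SGD", "lr": 1e-3, "momentum": 0.9}

EARLY_STOPPING_PATIENCE = 10  # "early stopping criterion of 10"
HIERARCHY_M = 10              # default hierarchy parameter M

def hidden_widths(d):
    """Candidate hidden-layer widths swept in the experiments:
    [d/3, 2d/3, d, 4d/3], where d is the input dimension."""
    return [d // 3, (2 * d) // 3, d, (4 * d) // 3]
```

For instance, with d = 300 input features the swept widths are [100, 200, 300, 400].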