Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Two-Layer Feature Reduction for Sparse-Group Lasso via Decomposition of Convex Sets

Authors: Jie Wang, Zhanqiu Zhang, Jieping Ye

JMLR 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments in Section 6 on both synthetic and real data demonstrate that the speedup gained by the proposed screening rules in solving SGL and nonnegative Lasso can be orders of magnitude. |
| Researcher Affiliation | Academia | Jie Wang, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China; Zhanqiu Zhang, Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei, Anhui, China; Jieping Ye, Department of Computational Medicine and Bioinformatics and Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109-2218, USA |
| Pseudocode | Yes | Algorithm 1: Guidelines for developing TLFre. 1: Given a pair of parameter values (λ, α), we estimate a region Θ that contains the dual optimum θ*(λ, α) of (4). 2: We solve the following two optimization problems: [...] |
| Open Source Code | Yes | The code is available at http://dpc-screening.github.io/. |
| Open Datasets | Yes | We perform experiments on two commonly used real data sets: the Alzheimer's Disease Neuroimaging Initiative (ADNI) data set (http://adni.loni.usc.edu/) and the news20.binary (Chang and Lin, 2011) data set. |
| Dataset Splits | Yes | We generate two data sets with 1000 × 160000 entries: Synthetic 1 and Synthetic 2. We randomly divide the 160000 features into 16000 groups. [...] The training and test sets contain 60,000 and 10,000 images, respectively. We first randomly select 5000 images for each digit from the training set and get a data matrix X ∈ R^(784×50000). Then, in each trial, we randomly select an image from the test set as the response y ∈ R^784. |
| Hardware Specification | No | No specific hardware details (such as exact GPU/CPU models, processor types, or memory amounts) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | We use sgLeastR from the SLEP package (Liu et al., 2009) as the solver for SGL, which is one of the state-of-the-art solvers (Zhang et al., 2018b) [see Section G for a comparison between sgLeastR and another popular solver (Lin et al., 2014)]. |
| Experiment Setup | Yes | Given a data set, for illustrative purposes only, we select seven values of α from {tan(ψ) : ψ = 5°, 15°, 30°, 45°, 60°, 75°, 85°}. Then, for each value of α, we run TLFre along a sequence of 100 values of λ equally spaced on the logarithmic scale of λ/λ_max^α from 1 to 0.01. [...] We use zero as the initial point. |
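The experiment-setup evidence describes a concrete parameter grid: seven α values obtained as tan(ψ) of fixed angles, and for each α, 100 values of λ spaced evenly on a log scale so that λ/λ_max^α runs from 1 down to 0.01. The following is a minimal sketch of that grid construction, not the authors' code; `lambda_max` is a hypothetical placeholder for the data-dependent λ_max^α computed in the paper.

```python
import numpy as np

# Seven alpha values, alpha = tan(psi) for the angles quoted in the setup.
angles_deg = [5, 15, 30, 45, 60, 75, 85]
alphas = [np.tan(np.deg2rad(a)) for a in angles_deg]

# 100 values of lambda/lambda_max^alpha, equally spaced on a log scale
# from 1 down to 0.01, as described in the evidence above.
ratios = np.logspace(0, -2, 100)

# lambda_max is a placeholder here; in the paper lambda_max^alpha is
# computed from the data for each alpha.
lambda_max = 1.0
lambdas = lambda_max * ratios
```

The grid then pairs each of the 7 α values with the same 100-point λ path, giving the 700 (λ, α) settings swept in the experiments.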