Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
No penalty, no tears: Least squares in high-dimensional linear models
Authors: Xiangyu Wang, David Dunson, Chenlei Leng
ICML 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical exercises comparing our methods with penalization-based approaches in simulations and data analyses illustrate the great potential of the proposed algorithms. |
| Researcher Affiliation | Academia | Xiangyu Wang EMAIL, Department of Statistical Science, Duke University; David Dunson EMAIL, Department of Statistical Science, Duke University; Chenlei Leng EMAIL, Department of Statistics, University of Warwick |
| Pseudocode | Yes | Algorithm 1: The Least-squares Adaptive Thresholding (LAT) Algorithm. Initialization: 1: Input (Y, X), d, δ. Stage 1: Pre-selection ... Algorithm 2: The Ridge Adaptive Thresholding (RAT) Algorithm. Initialization: 1: Input (Y, X), d, δ, r. Stage 1: Pre-selection ... |
| Open Source Code | No | The paper does not provide a direct link or explicit statement about the public availability of its source code. |
| Open Datasets | No | The paper uses synthetic datasets and a student performance dataset. For the student performance dataset, it cites 'Cortez and Silva, 2008' but does not provide specific access information like a URL, DOI, or repository for public access. |
| Dataset Splits | Yes | We use one of the 10 parts as a test set, fit all the methods on the other 9 parts, and then record their prediction error (root mean square error, RMSE), model size and runtime on the test set. We repeat this procedure until each of the 10 parts has been used for testing. |
| Hardware Specification | No | No specific hardware details (like GPU/CPU models or types) are provided for the experiments. |
| Software Dependencies | No | The paper mentions using 'Matlab', 'glmnet (Friedman et al., 2010) for enet and lasso', and 'Sparse Reg (Zhou et al., 2012; Zhou and Lange, 2013) for scad and mc+'. However, it does not provide specific version numbers for these software components or libraries. |
| Experiment Setup | Yes | For RAT and LAT, d is set to 0.3 n. For RAT and lars Ridge, we adopt a 10-fold cross-validation procedure to tune the ridge parameter r for a better finite-sample performance, although the theory allows r to be fixed as a constant. For all hard-thresholding steps, we fix δ = 0.5. |
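The evaluation protocol quoted under Dataset Splits is standard 10-fold cross-validation: each of the 10 parts serves once as a test set while the model is fit on the other 9, recording root mean square error per fold. A minimal sketch of that loop, using plain least squares as the fitted model purely for illustration (the paper's LAT/RAT estimators are not reimplemented here; `kfold_rmse`, `fit`, and `predict` are names introduced for this example):

```python
import numpy as np

def kfold_rmse(X, y, fit, predict, k=10, seed=0):
    """Hold out each of k folds once, fit on the rest, record test RMSE."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, k)
    rmses = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        model = fit(X[train], y[train])
        resid = y[test] - predict(model, X[test])
        rmses.append(np.sqrt(np.mean(resid ** 2)))
    return np.array(rmses)

# Illustrative stand-in model: ordinary least squares (not the paper's LAT/RAT).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, -2.0, 0.0, 0.5, 3.0]) + 0.1 * rng.normal(size=200)
fit = lambda X, y: np.linalg.lstsq(X, y, rcond=None)[0]
predict = lambda beta, X: X @ beta
print(kfold_rmse(X, y, fit, predict).mean())
```

With a well-specified model and noise standard deviation 0.1, the mean RMSE across folds should sit near the noise level.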
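The RAT pseudocode combines a ridge estimate (parameter r, tuned by cross-validation in the experiments) with a hard-thresholding step governed by δ = 0.5. The sketch below conveys that two-step flavor under a deliberately simplified thresholding rule (zeroing coefficients below δ times the largest absolute coefficient); the paper's actual adaptive threshold and pre-selection stage are not reproduced, and `ridge_threshold` is a name invented for this illustration:

```python
import numpy as np

def ridge_threshold(X, y, r=1.0, delta=0.5):
    """Ridge estimate followed by hard thresholding (simplified rule).

    This is NOT the paper's RAT algorithm: the threshold here is a fixed
    fraction of the largest coefficient, whereas the paper uses a
    data-adaptive rule. It only illustrates the estimate-then-threshold idea.
    """
    n, p = X.shape
    # Ridge solution: (X'X + r I)^{-1} X'y
    beta = np.linalg.solve(X.T @ X + r * np.eye(p), X.T @ y)
    # Keep coefficients at least delta * max|beta|; zero out the rest.
    keep = np.abs(beta) >= delta * np.abs(beta).max()
    return np.where(keep, beta, 0.0)

# Usage on synthetic sparse data: only the first two coefficients are nonzero.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
beta_true = np.array([3.0, 2.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=200)
print(ridge_threshold(X, y, r=1.0, delta=0.5))
```

On this data the small noise-driven coefficients fall below the threshold and are set exactly to zero, which is the sparsity mechanism the δ parameter controls.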