Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Saturating Splines and Feature Selection
Authors: Nicholas Boyd, Trevor Hastie, Stephen Boyd, Benjamin Recht, Michael I. Jordan
JMLR 2017 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate the effectiveness of the method with several examples in Section 7. ... In all examples we affinely preprocess the data so that all training features lie in [0, 1], and apply the same transformation to the test features (which thus may have values outside of [0, 1]). For the bone density and abalone data sets we select τ to minimize error on the validation sets. For the Spam and ALS data sets we use cross-validation to estimate τ. We hold out a random subset of size 100 from the training set and train on the remaining data. For each random validation/train split we estimate τ to minimize hold-out error and take our final estimate of τ as the mean over 50 trials. |
| Researcher Affiliation | Academia | Nicholas Boyd EMAIL Department of Statistics University of California Berkeley, CA 94720-1776, USA Trevor Hastie EMAIL Department of Statistics Stanford University Stanford, CA 94305, USA Stephen Boyd EMAIL Department of Electrical Engineering Stanford University Stanford, CA 94305, USA Benjamin Recht EMAIL Department of Electrical Engineering and Computer Science University of California Berkeley, CA 94720-1776, USA Michael I. Jordan EMAIL Division of Computer Science and Department of Statistics University of California Berkeley, CA 94720-1776, USA |
| Pseudocode | Yes | Algorithm 1 Fully-corrective conditional gradient method. For m = 1, . . .: 1. Linearize: f̂(s; xₘ) = f(xₘ) + ⟨∇f(xₘ), s − xₘ⟩. 2. Minimize: sₘ ∈ arg minₛ∈C f̂(s; xₘ). 3. Update: xₘ ∈ arg minₓ∈conv(s₁,...,sₘ) f(x). |
| Open Source Code | Yes | Appendix A. Implementation Details We provide a simple, unoptimized implementation in the Rust language. |
| Open Datasets | Yes | We start with a simple univariate data set from (Hastie et al., 2001, 5.4). ... We fit a generalized additive model with saturating spline coordinate functions to the Abalone data set from the UCI Machine Learning Repository (Lichman, 2013). ... We consider the problem of classifying email into spam/not spam, with a data set taken from ESL (Hastie et al., 2001). ... Using this data set we try to predict the rate of progression of ALS (amyotrophic lateral sclerosis) in medical patients, as measured by the rate of change in their functional rating score, a measurement of functional impairment. ... Following (Efron and Hastie, 2016, 17.2), we measure performance using mean-squared error. |
| Dataset Splits | Yes | There are 259 data points, of which we hold out 120 for validation, leaving 139 data points to which we fit a saturating spline. ... We hold out 400 data points as a validation set, leaving 3777 data points to fit the model. ... we use the standard train/validation split, with a training set of size 3065, and test set with 1536 samples. ... The data set is split into a training set of 1197 examples and a validation set of 625 additional patients. ... We hold out a random subset of size 100 from the training set and train on the remaining data. For each random validation/train split we estimate τ to minimize hold-out error and take our final estimate of τ as the mean over 50 trials. |
| Hardware Specification | No | The paper does not contain specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper states "We provide a simple, unoptimized implementation in the Rust language," but does not specify a version of Rust or of any libraries used. It also cites GLMNet (Friedman et al., 2010) without giving version numbers for its own implementation. |
| Experiment Setup | Yes | In all examples we affinely preprocess the data so that all training features lie in [0, 1], and apply the same transformation to the test features (which thus may have values outside of [0, 1]). ... For our experiment we take δ = 0.0015; roughly speaking, the transition between square and linear loss occurs around δ = 0.039. ... For the bone density and abalone data sets we select τ to minimize error on the validation sets. For the Spam and ALS data sets we use cross-validation to estimate τ. We hold out a random subset of size 100 from the training set and train on the remaining data. For each random validation/train split we estimate τ to minimize hold-out error and take our final estimate of τ as the mean over 50 trials. ... We estimate the optimal value of τ using cross validation with a hold-out size of 100 examples and 50 samples; this procedure suggests τ = 13. |
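The fully-corrective conditional gradient method quoted in the Pseudocode row can be sketched as follows. This is a minimal illustration, not the paper's Rust implementation: it minimizes a toy quadratic f(x) = ½‖x − b‖² over the ℓ1 ball (a stand-in for the constraint set C), and the fully-corrective step (minimizing over conv(s₁, ..., sₘ)) is approximated with exponentiated-gradient iterations on the simplex weights; the function names and step sizes are assumptions.

```python
import numpy as np

def lmo_l1(grad, tau=1.0):
    # Step 2 of Algorithm 1: minimize the linearization over C.
    # Over the l1 ball of radius tau, the minimizer of <grad, s>
    # is a signed, scaled coordinate vector.
    i = np.argmax(np.abs(grad))
    s = np.zeros_like(grad)
    s[i] = -tau * np.sign(grad[i])
    return s

def fully_corrective_fw(b, tau=1.0, iters=20):
    # Minimize f(x) = 0.5 * ||x - b||^2 over the l1 ball of radius tau,
    # following the three steps of Algorithm 1.
    atoms = []
    x = np.zeros_like(b)
    for _ in range(iters):
        grad = x - b                       # step 1: linearize f at x_m
        atoms.append(lmo_l1(grad, tau))    # step 2: minimize over C
        S = np.stack(atoms, axis=1)        # columns are atoms s_1..s_m
        # Step 3 (fully-corrective update): minimize f over conv(s_1..s_m)
        # via exponentiated gradient on the simplex weights.
        lam = np.full(S.shape[1], 1.0 / S.shape[1])
        for _ in range(500):
            g = S.T @ (S @ lam - b)
            lam *= np.exp(-0.5 * g)
            lam /= lam.sum()
        x = S @ lam
    return x

b = np.array([0.9, -0.3, 0.1])
x = fully_corrective_fw(b, tau=1.0)  # converges to the l1-projection of b
```

For this b the exact solution is the soft-thresholded vector (0.8, −0.2, 0), which the iteration recovers after the second correction step; the key property of the fully-corrective variant is that each update re-optimizes over all atoms found so far, not just along the newest direction.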
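The experiment-setup procedure quoted above (affine preprocessing of features into [0, 1], then estimating τ by averaging hold-out-minimizing values over 50 random splits of size 100) can be sketched like this. The data are synthetic and ridge regression is a hypothetical stand-in for the paper's saturating-spline fit; only the preprocessing and the τ-selection loop mirror the quoted procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data standing in for one of the paper's data sets.
X = rng.normal(size=(300, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.5 * rng.normal(size=300)

# Affinely preprocess so all features lie in [0, 1]; the same (lo, scale)
# map would be reused on any held-out test features.
lo, hi = X.min(axis=0), X.max(axis=0)
X = (X - lo) / np.where(hi > lo, hi - lo, 1.0)

def fit_ridge(X, y, tau):
    # Ridge regression as a stand-in for fitting with regularization tau.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + tau * np.eye(d), X.T @ y)

taus = np.logspace(-3, 2, 20)
estimates = []
for _ in range(50):                    # 50 random validation/train splits
    perm = rng.permutation(len(y))
    val, tr = perm[:100], perm[100:]   # hold out a random subset of size 100
    errs = [np.mean((X[val] @ fit_ridge(X[tr], y[tr], t) - y[val]) ** 2)
            for t in taus]
    estimates.append(taus[int(np.argmin(errs))])
tau_hat = float(np.mean(estimates))    # final estimate: mean over the trials
```

Averaging the per-split minimizers, rather than picking the single best grid point, is the detail the paper emphasizes; it reduces the variance introduced by any one random hold-out set.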