Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Additive Nonlinear Quantile Regression in Ultra-high Dimension

Authors: Ben Sherwood, Adam Maidman

JMLR 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The performance of the method is tested using Monte Carlo simulations, an analysis of fat content of meat conditional on a 100 channel spectrum of absorbances and predicting TRIM32 expression using gene expression data from the eyes of rats.
Researcher Affiliation	Collaboration	Ben Sherwood EMAIL School of Business University of Kansas Lawrence, KS 66045, USA Adam Maidman EMAIL Microsoft One Microsoft Way Redmond, WA 98052, USA
Pseudocode	Yes	4. Algorithm The objective function Q(γ) is non-convex and for high-dimensional data a grid search approach is not reasonable. ... We contribute to the literature by proposing a coordinate descent algorithm for quantile regression with a nonconvex group penalty.
Open Source Code	Yes	In addition, our implementation is publicly available on CRAN (Sherwood and Maidman, 2020).
Open Datasets	Yes	The performance of the method is tested using Monte Carlo simulations, an analysis of fat content of meat conditional on a 100 channel spectrum of absorbances and predicting TRIM32 expression using gene expression data from the eyes of rats... Our analysis is limited to the 215 samples available from the R package faraway (Faraway, 2016)... Scheetz et al. (2006) used 31,042 different probe sets to analyze RNA from the eyes of 120 twelve-week old male rats.
Dataset Splits	Yes	For the first two settings, models are fit using 500 training samples. Then 1000 testing samples are generated from the same model... To compare the methods we randomly sample 200 of the 210 samples as training data and the other 10 samples are used as testing data... First the data is randomly partitioned into a training set of 100 observations and a testing set of 20 observations.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments.
Software Dependencies	Yes	The quantile and mean regression models are fit using the R packages rqPen (Sherwood and Maidman, 2020) and grpreg (Breheny and Zeng, 2017), respectively... Ben Sherwood and Adam Maidman. rqpen: Penalized quantile regression 2.2.2, 2020. URL https://cran.r-project.org/package=rqPen.
Experiment Setup	Yes	For both MA-SCAD and QA-SCAD we set a = 3.7... In the following data analysis we used K = 20 and ϵ = .00001... All models are fit using B-splines with the training and testing covariates transformed using cubic B-splines with Jn = 3... For Setting IIB Jn was set to 5, while in all other settings we fixed Jn = 3... The tuning parameter λ is selected using BIC, as outlined in the previous section, and we set a = 3.7.
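
Some of the ingredients quoted in the table (the quantile check loss, a SCAD penalty with a = 3.7, and a random 200/10 train/test split) can be illustrated with a minimal Python sketch. This is not the authors' rqPen implementation. The function names are hypothetical, the SCAD form is the standard one from Fan and Li (2001), the penalty is evaluated on a single non-negative scalar (for a group penalty this would be a group norm), and the seed is arbitrary.

```python
import random


def check_loss(u, tau):
    """Quantile check loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def scad_penalty(t, lam, a=3.7):
    """Standard SCAD penalty evaluated at |t|, assuming lam > 0 and a > 2.

    a = 3.7 matches the value quoted in the table; the piecewise form is
    the textbook one and is an assumption about the paper's exact penalty.
    """
    t = abs(t)
    if t <= lam:
        # Linear (lasso-like) region near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large values are not penalized further.
    return lam * lam * (a + 1) / 2


def random_split(n, n_train, seed=0):
    """Randomly partition indices 0..n-1 into train and test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]


# Hypothetical usage mirroring the quoted 200/10 split of 210 samples.
train_idx, test_idx = random_split(210, 200)
```

The penalty is continuous at both breakpoints (|t| = λ and |t| = aλ) and flat beyond aλ, so large coefficients incur no additional shrinkage, unlike the lasso.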