Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Additive Nonlinear Quantile Regression in Ultra-high Dimension
Authors: Ben Sherwood, Adam Maidman
JMLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The performance of the method is tested using Monte Carlo simulations, an analysis of fat content of meat conditional on a 100 channel spectrum of absorbances and predicting TRIM32 expression using gene expression data from the eyes of rats. |
| Researcher Affiliation | Collaboration | Ben Sherwood, School of Business, University of Kansas, Lawrence, KS 66045, USA; Adam Maidman, Microsoft, One Microsoft Way, Redmond, WA 98052, USA |
| Pseudocode | Yes | Section 4 (Algorithm): The objective function Q(γ) is non-convex and for high-dimensional data a grid search approach is not reasonable. ... We contribute to the literature by proposing a coordinate descent algorithm for quantile regression with a nonconvex group penalty. |
| Open Source Code | Yes | In addition, our implementation is publicly available on CRAN (Sherwood and Maidman, 2020). |
| Open Datasets | Yes | The performance of the method is tested using Monte Carlo simulations, an analysis of fat content of meat conditional on a 100 channel spectrum of absorbances and predicting TRIM32 expression using gene expression data from the eyes of rats... Our analysis is limited to the 215 samples available from the R package faraway (Faraway, 2016)... Scheetz et al. (2006) used 31,042 different probe sets to analyze RNA from the eyes of 120 twelve-week old male rats. |
| Dataset Splits | Yes | For the first two settings, models are fit using 500 training samples. Then 1000 testing samples are generated from the same model... To compare the methods we randomly sample 200 of the 210 samples as training data and the other 10 samples are used as testing data... First the data is randomly partitioned into a training set of 100 observations and a testing set of 20 observations. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | Yes | The quantile and mean regression models are fit using the R packages rqPen (Sherwood and Maidman, 2020) and grpreg (Breheny and Zeng, 2017), respectively... Ben Sherwood and Adam Maidman. rqPen: Penalized quantile regression 2.2.2, 2020. URL https://cran.r-project.org/package=rqPen. |
| Experiment Setup | Yes | For both MA-SCAD and QA-SCAD we set a = 3.7... In the following data analysis we used K = 20 and ϵ = .00001... All models are fit using B-splines with the training and testing covariates transformed using cubic B-splines with Jn = 3... For Setting IIB Jn was set to 5, while in all other settings we fixed Jn = 3... The tuning parameter λ is selected using BIC, as outlined in the previous section, and we set a = 3.7. |
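
The Experiment Setup row fixes a = 3.7 for the SCAD penalty and refers to quantile regression throughout. As a reading aid, the sketch below writes out the standard quantile check loss and the standard scalar SCAD penalty (Fan and Li's form) in plain Python. It is not the authors' rqPen implementation: the function names `check_loss` and `scad_penalty` are hypothetical, and which group norm is passed in as `t` is left as an assumption for the caller.

```python
def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    indicator = 1.0 if u < 0 else 0.0
    return u * (tau - indicator)


def scad_penalty(t, lam, a=3.7):
    """Standard SCAD penalty evaluated at a nonnegative value t.

    t   : nonnegative input, e.g. a norm of one group of coefficients
          (the choice of norm is an assumption left to the caller)
    lam : tuning parameter lambda >= 0
    a   : shape parameter, must exceed 2; 3.7 matches the value quoted above
    """
    if t < 0 or lam < 0 or a <= 2:
        raise ValueError("require t >= 0, lam >= 0 and a > 2")
    if t <= lam:
        # Linear (lasso-like) region near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant region: large values are not penalized further.
    return lam * lam * (a + 1) / 2


if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 3.7, 10.0):
        print(t, scad_penalty(t, lam=1.0))
    print(check_loss(2.0, 0.5), check_loss(-2.0, 0.25))
```

The penalty is continuous at both breakpoints (t = λ and t = aλ) and flat beyond aλ, which is why large coefficient groups are not shrunk further.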