Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Non-parametric Quantile Regression via the K-NN Fused Lasso
Authors: Steven Siwei Ye, Oscar Hernan Madrid Padilla
JMLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments on simulated and real data demonstrate clear advantages of the proposed estimator over state-of-the-art methods. All codes that implement the algorithms and the datasets used in the experiments are publicly available on the author's GitHub page (https://github.com/stevenysw/qt_knnfl). In Section 5, we list the results of numerical experiments on multiple simulated datasets and two real datasets, California housing data and Chicago crime data. The experiments show that the proposed estimator outperforms state-of-the-art methods on both simulated and real datasets. |
| Researcher Affiliation | Academia | Steven Siwei Ye, Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA; Oscar Hernan Madrid Padilla, Department of Statistics, University of California, Los Angeles, Los Angeles, CA 90095, USA |
| Pseudocode | Yes | Algorithm 1: Alternating Directions Method of Multipliers for quantile K-NN fused lasso ... Algorithm 2: Majorize-Minimize for quantile K-NN fused lasso, τ = 0.5 |
| Open Source Code | Yes | All codes that implement the algorithms and the datasets used in the experiments are publicly available on the author's GitHub page (https://github.com/stevenysw/qt_knnfl). |
| Open Datasets | Yes | Numerical experiments on simulated and real data... 5.2.1 California Housing Data... is publicly available from the Carnegie Mellon StatLib data repository (lib.stat.cmu.edu). 5.2.2 Chicago Crime Data... a dataset of publicly available crime report counts in Chicago, Illinois in 2015. |
| Dataset Splits | Yes | 5.2.1 California Housing Data: We perform 100 random train-test splits of the data, with training sizes 1000, 5000, and 10000. For each split, the data not in the training set is treated as testing data. ... 5.2.2 Chicago Crime Data: ... we perform a train-test split with training sizes 500, 1000, 1500, and 2000 |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for its experiments. It discusses computational time but does not mention specific CPU, GPU models, or memory. |
| Software Dependencies | No | For quantile random forest, we directly use the R package quantregForest with defaulted choice of tree structure and tuning parameters. This mentions a software package but does not provide a specific version number, which is required for a reproducible description. |
| Experiment Setup | Yes | For quantile K-NN fused lasso, we use the ADMM algorithm and select the tuning parameter λ based on the BIC criteria described in Section 3.3... For quantile random forest, we directly use the R package quantregForest with defaulted choice of tree structure and tuning parameters. Throughout, for both K-NN fused lasso and quantile K-NN fused lasso, we set K to be 5 for sufficient information and efficient computation. |
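
The split protocol and the K = 5 neighborhood setting quoted in the table can be illustrated with a minimal Python sketch. It is not the authors' code (their implementation is in the linked repository, and the quantile random forest baseline uses R). The function names `pinball_loss`, `knn_edges`, and `random_splits` are hypothetical. The edge rule (connect i and j if either is among the other's K nearest neighbors) and the use of the quantile check loss for evaluation are assumptions made for illustration only.

```python
import numpy as np


def pinball_loss(y, theta, tau=0.5):
    """Average quantile (check) loss at level tau.

    Assumption: shown as one plausible evaluation metric for quantile
    estimates; the table does not state which metric the paper reports.
    """
    r = np.asarray(y, dtype=float) - np.asarray(theta, dtype=float)
    return float(np.mean(np.maximum(tau * r, (tau - 1.0) * r)))


def knn_edges(X, k=5):
    """Undirected K-NN graph as a sorted list of index pairs (i, j), i < j.

    Assumption: i and j are joined if either is among the other's k nearest
    neighbors in Euclidean distance. Uses a dense distance matrix, so it is
    only suitable for small n.
    """
    X = np.asarray(X, dtype=float)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbor
    nbrs = np.argsort(d2, axis=1)[:, :k]
    edges = {
        (min(i, int(j)), max(i, int(j)))
        for i in range(X.shape[0])
        for j in nbrs[i]
    }
    return sorted(edges)


def random_splits(n, train_size, n_splits=100, seed=0):
    """Yield (train_idx, test_idx) pairs; all points outside the training
    set form the test set, as in the quoted protocol."""
    rng = np.random.default_rng(seed)
    for _ in range(n_splits):
        perm = rng.permutation(n)
        yield perm[:train_size], perm[train_size:]
```

For the California housing experiment, for example, one would call `random_splits(n, 1000)` (and likewise with 5000 and 10000) to obtain 100 index pairs, fit on each training subset, and score predictions on the complementary test subset.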