Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Prediction regions through Inverse Regression
Authors: Emilie Devijver, Emeline Perthame
JMLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The performances of the proposed estimators and prediction regions are also analyzed through a simulation study and compared with usual estimators. (Abstract) and The finite-sample performance of the proposed confidence and prediction regions are investigated in Section 5, which also includes a comparison with existing methods namely least squares and Lasso. (Section 1, last paragraph) and Section 5 is titled Simulations. |
| Researcher Affiliation | Academia | Emilie Devijver EMAIL Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, 38000 Grenoble, France; Emeline Perthame EMAIL Hub de Bioinformatique et Biostatistique, Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, Paris, France |
| Pseudocode | No | The paper describes mathematical models, theorems, and estimation procedures but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The R code to use the 3 compared methods on simulated data is available at https://research.pasteur.fr/fr/member/emeline-perthame/. |
| Open Datasets | No | Data are simulated according to an inverse regression model and forward parameters are deduced from Equation (6). (Section 5.1). This indicates simulated data, not a publicly available dataset. |
| Dataset Splits | Yes | For each simulated design, 1 000 learning datasets with dimension (N, D) are generated as well as 1 000 corresponding testing observations. |
| Hardware Specification | Yes | computation time (on log scale) required to compute the prediction region on a Mac Book Pro 2,9 GHz Intel Core i5 processor RAM 16 Go with programs written in R. |
| Software Dependencies | No | The paper mentions using 'glmnet R package' and 'R package matrixcalc' but does not specify version numbers for these packages or for R itself. |
| Experiment Setup | Yes | The response dimension L is varying in {1, 2, 5}. (Section 5.1); for D = 100, we consider a high-dimensional one with N = 50, an asymptotic one with N = 500 and an intermediate design with N = 100. We also study a design with D = 1000 and N = 100 (Section 5.1); In this simulation study, the level of confidence for prediction regions is set to 95%. (Section 5.1); By repeating this procedure B = 100 times, the distribution of the prediction is estimated. (Section 5.1) |
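
The settings quoted in the Experiment Setup row can be collected into a small configuration grid. The sketch below is illustrative only and is not the authors' R code: the constant names, the dictionary keys, and the assumption that every response dimension L is crossed with every (N, D) design are hypothetical; only the numeric values come from the quoted text.

```python
from itertools import product

# Values quoted from Section 5.1 of the paper (via the table above).
RESPONSE_DIMS = (1, 2, 5)                     # L, the response dimension
DESIGNS = (                                   # (N, D) pairs
    (50, 100),    # high-dimensional design
    (100, 100),   # intermediate design
    (500, 100),   # asymptotic design
    (100, 1000),  # additional design with D = 1000
)
CONFIDENCE_LEVEL = 0.95                       # level of the prediction regions
B_REPETITIONS = 100                           # B, repetitions of the procedure
N_LEARNING_DATASETS = 1000                    # learning datasets per design


def simulation_grid():
    """Build one settings dict per (L, N, D) combination.

    Assumption: L is fully crossed with the four designs; the summary
    above does not state this explicitly.
    """
    return [
        {
            "L": L,
            "N": N,
            "D": D,
            "level": CONFIDENCE_LEVEL,
            "B": B_REPETITIONS,
            "n_datasets": N_LEARNING_DATASETS,
        }
        for L, (N, D) in product(RESPONSE_DIMS, DESIGNS)
    ]


grid = simulation_grid()
```

Under the full-crossing assumption, `grid` holds one entry per combination of the three values of L and the four designs, which makes it easy to loop over settings when re-running the comparison.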