Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Prediction regions through Inverse Regression

Authors: Emilie Devijver, Emeline Perthame

JMLR 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The performances of the proposed estimators and prediction regions are also analyzed through a simulation study and compared with usual estimators." (Abstract); "The finite-sample performance of the proposed confidence and prediction regions are investigated in Section 5, which also includes a comparison with existing methods namely least squares and Lasso." (Section 1, last paragraph); Section 5 is titled "Simulations".
Researcher Affiliation | Academia | Emilie Devijver, EMAIL, Univ. Grenoble Alpes, CNRS, Grenoble INP, LIG, 38000 Grenoble, France; Emeline Perthame, EMAIL, Hub de Bioinformatique et Biostatistique, Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, Paris, France
Pseudocode | No | The paper describes mathematical models, theorems, and estimation procedures but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The R code to use the 3 compared methods on simulated data is available at https://research.pasteur.fr/fr/member/emeline-perthame/.
Open Datasets | No | "Data are simulated according to an inverse regression model and forward parameters are deduced from Equation (6)." (Section 5.1) This indicates simulated data, not a publicly available dataset.
Dataset Splits | Yes | For each simulated design, 1 000 learning datasets with dimension (N, D) are generated as well as 1 000 corresponding testing observations.
Hardware Specification | Yes | computation time (on log scale) required to compute the prediction region on a Mac Book Pro 2,9 GHz Intel Core i5 processor RAM 16 Go with programs written in R.
Software Dependencies | No | The paper mentions using 'glmnet R package' and 'R package matrixcalc' but does not specify version numbers for these packages or for R itself.
Experiment Setup | Yes | "The response dimension L is varying in {1, 2, 5}." (Section 5.1); "for D = 100, we consider a high-dimensional one with N = 50, an asymptotic one with N = 500 and an intermediate design with N = 100. We also study a design with D = 1000 and N = 100" (Section 5.1); "In this simulation study, the level of confidence for prediction regions is set to 95%." (Section 5.1); "By repeating this procedure B = 100 times, the distribution of the prediction is estimated." (Section 5.1)
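
The experiment setup quoted above can be written down as a small configuration grid. The following is a minimal Python sketch, not the authors' R code: the constant and function names are hypothetical, and crossing every response dimension L with every (D, N) design is an assumption, since the quoted passages do not state that all combinations were run.

```python
from itertools import product

# Values quoted from Section 5.1 of the paper (see the table above).
RESPONSE_DIMS = (1, 2, 5)      # response dimension L
DESIGNS = (                    # (D, N): covariate dimension, learning sample size
    (100, 50),                 # high-dimensional design
    (100, 100),                # intermediate design
    (100, 500),                # asymptotic design
    (1000, 100),               # additional design with D = 1000
)
CONFIDENCE_LEVEL = 0.95        # level of the prediction regions
N_REPETITIONS_B = 100          # B repetitions used to estimate the prediction distribution
N_DATASETS = 1000              # learning datasets (and testing observations) per design


def simulation_grid():
    """Enumerate (L, D, N) settings. Assumption: a full cross of L with the designs."""
    return [
        {"L": L, "D": D, "N": N}
        for L, (D, N) in product(RESPONSE_DIMS, DESIGNS)
    ]


if __name__ == "__main__":
    for setting in simulation_grid():
        print(setting)
```

Under the full-cross assumption this yields 12 settings, each of which would be repeated over the 1 000 simulated learning datasets mentioned in the Dataset Splits row.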