Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Conformal Prediction with Missing Values
Authors: Margaux Zaffran, Aymeric Dieuleveut, Julie Josse, Yaniv Romano
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using synthetic and data from critical care, we corroborate our theory and report improved performance of our methods. |
| Researcher Affiliation | Collaboration | 1 Électricité de France R&D, Palaiseau, France; 2 PreMeDICaL project team, INRIA Sophia-Antipolis, Montpellier, France; 3 CMAP, École polytechnique, Institut Polytechnique de Paris, Palaiseau, France; 4 Departments of Electrical Engineering and of Computer Science, Technion - Israel Institute of Technology, Haifa, Israel. |
| Pseudocode | Yes | Algorithm 1 CP-MDA-Exact (with CQR), Algorithm 2 CP-MDA-Nested (with CQR), Algorithm 3 SCP on impute-then-predict, Algorithm 4 CP-MDA-Exact |
| Open Source Code | Yes | The code to reproduce our experiments is available on GitHub. |
| Open Datasets | Yes | We consider 6 benchmark real data sets for regression: meps_19, meps_20, meps_21 (MEPS), bio, bike and concrete (Dua & Graff, 2017)..., MEPS. Medical expenditure panel survey. https://meps.ahrq.gov/mepsweb/data_stats/data_overview.jsp |
| Dataset Splits | Yes | The paper states: "Split CP (Papadopoulos et al., 2002; Lei et al., 2018) achieves Eq. (1) by keeping a hold-out set, the calibration set, used to evaluate the performance of a fixed predictive model." It also states: "The calibration size is fixed to 1000 and the test set contains 2000 points. ..." In addition, Table 1 specifies 'Tr size' and 'Cal size' values. |
| Hardware Specification | No | The paper mentions training models like Neural Networks and using Scikit-learn for iterative regression but does not specify any particular hardware components such as GPU models, CPU models, or cloud computing resources used for the experiments. |
| Software Dependencies | No | The paper mentions 'iterative regression (iterative ridge implemented in Scikit-learn, Pedregosa et al. (2011))' and a 'Neural Network (NN)' optimized using 'Adam (Kingma & Ba, 2014)', but it does not provide specific version numbers for Scikit-learn, PyTorch/TensorFlow (if used for the NN), Python, or any other software dependencies. |
| Experiment Setup | Yes | The network is composed of three fully connected layers with a hidden dimension of 64, and ReLU activation functions. We use the pinball loss to estimate the conditional quantiles, with a dropout regularization of rate 0.1. The network is optimized using Adam (Kingma & Ba, 2014) with a learning rate equal to 0.0005. We tune the optimal number of epochs by cross validation, minimizing the loss function on the hold-out data points; the maximal number of epochs is set to 2000. |
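
The table mentions two building blocks: the pinball loss used to fit conditional quantiles, and a held-out calibration set used by split conformal prediction with CQR. The sketch below illustrates both in plain Python. It is not the authors' code: the function names `pinball_loss` and `cqr_correction` and the toy numbers are assumptions made for illustration, and the missing-data handling of CP-MDA is not covered.

```python
import math


def pinball_loss(y_true, y_pred, tau):
    """Average pinball (quantile) loss at level tau in (0, 1).

    Hypothetical helper for illustration; not taken from the paper's code.
    """
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction (y >= q) is weighted by tau,
        # over-prediction (y < q) by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def cqr_correction(y_cal, lower_cal, upper_cal, alpha):
    """Split-conformal correction for quantile-regression intervals (CQR).

    Conformity score per calibration point: max(lower - y, y - upper).
    Returns the ceil((n + 1) * (1 - alpha))-th smallest score, or infinity
    when the calibration set is too small to reach that level.
    """
    scores = sorted(
        max(lo - y, y - hi) for y, lo, hi in zip(y_cal, lower_cal, upper_cal)
    )
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return scores[k - 1] if k <= n else math.inf


# Toy calibration data (made up for this example).
y_cal = [1.0, 2.0, 3.0, 4.0]
lower_cal = [0.5, 2.5, 2.0, 3.0]
upper_cal = [1.5, 3.0, 2.5, 5.0]

correction = cqr_correction(y_cal, lower_cal, upper_cal, alpha=0.5)

# A test-time interval [lower, upper] is widened by the correction on both sides.
lower_test, upper_test = 2.0, 3.0
interval = (lower_test - correction, upper_test + correction)
```

With a calibration set of size 1000, as reported in the table, the same function would pick the `ceil(1001 * (1 - alpha))`-th smallest score; the toy set of four points only keeps the example easy to check by hand.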