Conformal Prediction with Missing Values
Authors: Margaux Zaffran, Aymeric Dieuleveut, Julie Josse, Yaniv Romano
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Using synthetic and data from critical care, we corroborate our theory and report improved performance of our methods. |
| Researcher Affiliation | Collaboration | 1 Électricité de France R&D, Palaiseau, France; 2 PreMeDICaL project team, INRIA Sophia-Antipolis, Montpellier, France; 3 CMAP, École Polytechnique, Institut Polytechnique de Paris, Palaiseau, France; 4 Departments of Electrical Engineering and of Computer Science, Technion - Israel Institute of Technology, Haifa, Israel. |
| Pseudocode | Yes | Algorithm 1 CP-MDA-Exact (with CQR), Algorithm 2 CP-MDA-Nested (with CQR), Algorithm 3 SCP on impute-then-predict, Algorithm 4 CP-MDA-Exact. Hedged sketches of Algorithm 3 and of the CP-MDA-Exact idea follow the table. |
| Open Source Code | Yes | The code to reproduce our experiments is available on GitHub. |
| Open Datasets | Yes | We consider 6 benchmark real data sets for regression: meps_19, meps_20, meps_21 (MEPS), bio, bike and concrete (Dua & Graff, 2017)... MEPS: Medical Expenditure Panel Survey, https://meps.ahrq.gov/mepsweb/data_stats/data_overview.jsp |
| Dataset Splits | Yes | Split CP (Papadopoulos et al., 2002; Lei et al., 2018) achieves Eq. (1) by keeping a hold-out set, the calibration set, used to evaluate the performance of a fixed predictive model. Also: The calibration size is fixed to 1000 and the test set contains 2000 points. Table 1 additionally specifies 'Tr size' and 'Cal size' values. |
| Hardware Specification | No | The paper mentions training models like Neural Networks and using Scikit-learn for iterative regression but does not specify any particular hardware components such as GPU models, CPU models, or cloud computing resources used for the experiments. |
| Software Dependencies | No | The paper mentions 'iterative regression (iterative ridge implemented in Scikit-learn, Pedregosa et al. (2011))' and 'Neural Network (NN)' optimized using 'Adam Kingma & Ba (2014)', but it does not provide specific version numbers for Scikit-learn, PyTorch/TensorFlow (if used for NN), Python, or any other software dependencies. |
| Experiment Setup | Yes | The network is composed of three fully connected layers with a hidden dimension of 64, and ReLU activation functions. We use the pinball loss to estimate the conditional quantiles, with a dropout regularization of rate 0.1. The network is optimized using Adam (Kingma & Ba, 2014) with a learning rate equal to 0.0005. We tune the optimal number of epochs by cross validation, minimizing the loss function on the hold-out data points; the maximal number of epochs is set to 2000. A hedged PyTorch sketch of this setup follows the table. |
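
For concreteness, the impute-then-predict baseline (Algorithm 3) with CQR conformity scores can be sketched in a few lines. This is a minimal sketch, not the authors' code: scikit-learn's `IterativeImputer` stands in for the paper's iterative ridge imputation, gradient-boosted quantile regressors stand in for its quantile neural network, and `alpha=0.1` is an assumed miscoverage level.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import GradientBoostingRegressor

def scp_impute_then_predict(X_tr, y_tr, X_cal, y_cal, X_test, alpha=0.1):
    """Split CP on impute-then-predict with CQR scores (sketch of Algorithm 3)."""
    # 1. Fit the imputer on the training split only, then impute every split.
    imputer = IterativeImputer(random_state=0).fit(X_tr)
    X_tr, X_cal, X_test = (imputer.transform(X) for X in (X_tr, X_cal, X_test))

    # 2. Fit lower/upper conditional quantile regressors on the imputed training data.
    q_lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_tr, y_tr)
    q_hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_tr, y_tr)

    # 3. CQR conformity scores on the calibration split.
    scores = np.maximum(q_lo.predict(X_cal) - y_cal, y_cal - q_hi.predict(X_cal))

    # 4. Finite-sample corrected empirical quantile of the scores.
    n = len(y_cal)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q_hat = np.quantile(scores, level, method="higher")

    # 5. Marginal prediction intervals on the test split.
    return q_lo.predict(X_test) - q_hat, q_hi.predict(X_test) + q_hat
```

With the split sizes reported above, `X_cal` would hold 1000 points and `X_test` 2000.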
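
CP-MDA-Exact changes only the calibration step. The sketch below is one reading of that idea, not the paper's implementation: for a test point with missingness pattern `m`, keep the calibration points whose pattern is contained in `m`, mask them further so their pattern equals `m` exactly, re-impute, and take the corrected score quantile on that subset. `imputer`, `q_lo`, and `q_hi` are assumed fitted as in the previous sketch.

```python
import numpy as np

def cp_mda_exact_interval(X_cal_raw, y_cal, x_test_raw, imputer, q_lo, q_hi, alpha=0.1):
    """One reading of CP-MDA-Exact: calibrate conditionally on the test mask."""
    m_test = np.isnan(x_test_raw)              # missingness pattern of the test point
    m_cal = np.isnan(X_cal_raw)
    keep = np.all(m_cal <= m_test, axis=1)     # patterns contained in the test pattern
    X_sub, y_sub = X_cal_raw[keep].copy(), y_cal[keep]

    X_sub[:, m_test] = np.nan                  # enforce exactly the test pattern
    X_sub = imputer.transform(X_sub)
    scores = np.maximum(q_lo.predict(X_sub) - y_sub, y_sub - q_hi.predict(X_sub))

    # Corrected quantile over the pattern-matched calibration subset.
    n = len(y_sub)
    q_hat = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n), method="higher")
    x_imp = imputer.transform(x_test_raw.reshape(1, -1))
    return q_lo.predict(x_imp) - q_hat, q_hi.predict(x_imp) + q_hat
```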
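
The quantile network from the last row maps directly onto PyTorch. The architecture and hyperparameters (three fully connected layers of hidden width 64, ReLU, dropout 0.1, pinball loss, Adam with learning rate 0.0005) come from the row above; the input dimension, the quantile levels (0.05, 0.95), and the rest of the training loop are assumptions for the sketch.

```python
import torch
import torch.nn as nn

class QuantileNet(nn.Module):
    """Three fully connected layers, hidden width 64, ReLU, dropout 0.1."""
    def __init__(self, d_in, d_hidden=64, n_quantiles=2, p_drop=0.1):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(d_hidden, d_hidden), nn.ReLU(), nn.Dropout(p_drop),
            nn.Linear(d_hidden, n_quantiles),  # one output per quantile level
        )

    def forward(self, x):
        return self.net(x)

def pinball_loss(pred, y, taus=(0.05, 0.95)):
    """Mean pinball (quantile) loss summed over the requested quantile levels."""
    losses = []
    for i, tau in enumerate(taus):
        err = y - pred[:, i]
        losses.append(torch.maximum(tau * err, (tau - 1) * err).mean())
    return sum(losses)

model = QuantileNet(d_in=10)  # d_in=10 is an arbitrary placeholder
opt = torch.optim.Adam(model.parameters(), lr=5e-4)
```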