Improving Expert Predictions with Conformal Prediction
Authors: Eleni Straitouri, Lequn Wang, Nastaran Okati, Manuel Gomez Rodriguez
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulation experiments using synthetic and real expert predictions demonstrate that our system may help experts make more accurate predictions and is robust to the accuracy of the classifier the conformal predictor relies on. (Section headings: 5. Experiments on Synthetic Data; 6. Experiments on Real Data) |
| Researcher Affiliation | Academia | (1) Max Planck Institute for Software Systems, Kaiserslautern, Germany; (2) Department of Computer Science, Cornell University, Ithaca, United States. |
| Pseudocode | Yes | Algorithm 1: Finding a near-optimal α̂ (see the first sketch after this table) |
| Open Source Code | Yes | An open-source implementation of our system is available at https://github.com/Networks-Learning/improve-expertpredictions-conformal-prediction. |
| Open Datasets | Yes | We experiment with the dataset CIFAR-10H (Peterson et al., 2019), which contains 10,000 natural images taken from the test set of the standard CIFAR-10 (Krizhevsky et al., 2009). |
| Dataset Splits | Yes | For each prediction task, we generate 10,000 samples, pick 20% of these samples at random as test set, which we use to estimate the performance of our system, and also randomly split the remaining 80% into three disjoint subsets for training, calibration, and estimation, whose sizes we vary across experiments (see the second sketch after this table). |
| Hardware Specification | Yes | All algorithms ran on a Debian machine equipped with Intel Xeon E5-2667 v4 @ 3.2 GHz, 32GB memory and two M40 Nvidia Tesla GPU cards. |
| Software Dependencies | Yes | To implement our algorithms and run all the experiments on synthetic and real data, we used PyTorch 1.12.1, NumPy 1.20.1 and Scikit-learn 1.0.2 on Python 3.9.2. |
| Experiment Setup | Yes | We create a variety of synthetic prediction tasks, each with 20 features per sample and a varying number of label values n and difficulty. For each prediction task, we train a logistic regression model Pθ(Y | X)... and we use three popular and highly accurate deep neural network classifiers trained on CIFAR-10, namely ResNet110 (He et al., 2016a), PreResNet-110 (He et al., 2016b) and DenseNet (Huang et al., 2017) (see the third sketch after this table). |
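
The Pseudocode row above refers to Algorithm 1, which finds a near-optimal coverage level α̂ so that an expert who must pick a label from the conformal prediction set is as accurate as possible. The following is a minimal Python/NumPy sketch of that idea, not the authors' implementation: it assumes a classifier that outputs class probabilities, a toy expert model given by a per-class confidence vector `expert_conf`, and a simple grid of candidate α values; the names `conformal_threshold`, `find_alpha_hat` and `alpha_grid` are illustrative.

```python
import numpy as np

def conformal_threshold(cal_probs, cal_labels, alpha):
    """Split-conformal threshold: the ceil((n+1)(1-alpha))/n empirical quantile
    of the nonconformity scores 1 - p(true label) on the calibration set."""
    n = len(cal_labels)
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    k = min(int(np.ceil((n + 1) * (1 - alpha))) - 1, n - 1)
    return scores[max(k, 0)]

def prediction_set(probs, threshold):
    """All labels whose nonconformity score falls below the calibrated threshold."""
    return np.where(1.0 - probs <= threshold)[0]

def estimated_expert_accuracy(est_probs, est_labels, threshold, expert_conf):
    """Estimate how often a toy expert picks the true label when restricted to the
    conformal prediction set; expert_conf[y] is the expert's relative confidence in y."""
    hits = []
    for probs, y in zip(est_probs, est_labels):
        cset = prediction_set(probs, threshold)
        if y not in cset:
            hits.append(0.0)  # true label excluded from the set -> expert cannot succeed
        else:
            hits.append(expert_conf[y] / (expert_conf[y] + len(cset) - 1))
    return float(np.mean(hits))

def find_alpha_hat(cal_probs, cal_labels, est_probs, est_labels, expert_conf,
                   alpha_grid=np.linspace(0.01, 0.5, 50)):
    """Pick the alpha whose prediction sets give the highest estimated expert accuracy."""
    return max(alpha_grid,
               key=lambda a: estimated_expert_accuracy(
                   est_probs, est_labels,
                   conformal_threshold(cal_probs, cal_labels, a), expert_conf))
```

The authors' algorithm and their expert model differ in detail; the sketch only shows the overall structure of calibrating prediction sets and then selecting α by estimated expert accuracy on a held-out estimation set.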
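
For the Dataset Splits row, a short sketch of the split described there, assuming scikit-learn's `train_test_split` and placeholder data; the paper varies the relative sizes of the training, calibration and estimation subsets across experiments, whereas the sketch simply uses an equal three-way split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 20))      # placeholder: 10,000 samples, 20 features each
y = rng.integers(0, 10, size=10_000)   # placeholder labels

# 20% of the samples are held out as the test set used to evaluate the system.
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# The remaining 80% is divided into three disjoint subsets (training, calibration,
# estimation); their relative sizes vary in the paper, here an equal split is used.
X_train, X_rem, y_train, y_rem = train_test_split(X_rest, y_rest, test_size=2 / 3, random_state=0)
X_cal, X_est, y_cal, y_est = train_test_split(X_rem, y_rem, test_size=0.5, random_state=0)
```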
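
For the Experiment Setup row, a hedged sketch of one synthetic prediction task and the logistic regression model Pθ(Y | X) trained on it. scikit-learn's `make_classification` and its `class_sep` parameter are used here as stand-ins for the paper's data generator and difficulty knob, which are not reproduced exactly.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def make_synthetic_task(n_labels, class_sep, n_samples=10_000, seed=0):
    """Synthetic task with 20 features per sample and n_labels classes;
    class_sep stands in for the task's 'difficulty' (illustrative only)."""
    return make_classification(
        n_samples=n_samples, n_features=20, n_informative=20, n_redundant=0,
        n_classes=n_labels, class_sep=class_sep, random_state=seed,
    )

X, y = make_synthetic_task(n_labels=10, class_sep=1.0)
clf = LogisticRegression(max_iter=1000).fit(X, y)  # the classifier P_theta(Y | X)
probs = clf.predict_proba(X)                       # class probabilities for the conformal predictor
```

In the real-data experiments the same role is played by the pretrained CIFAR-10 classifiers (ResNet110, PreResNet-110, DenseNet), whose softmax outputs feed the conformal predictor in the same way as the probabilities above.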