Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

FAB-PPI: Frequentist, Assisted by Bayes, Prediction-Powered Inference

Authors: Stefano Cortinovis, François Caron

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the benefits of FAB-PPI in real and synthetic examples. [...] We compare FAB-PPI and power-tuned FAB-PPI (FAB-PPI++) to classical inference, PPI and power-tuned PPI (PPI++) on both synthetic and real estimation problems. For FAB-PPI, we use (HS) and (N) to indicate the use of the horseshoe and Gaussian priors defined in Section 4.3. [...] For all experiments, we set α = 0.1 and report the average mean squared error (MSE), interval volume, and coverage over 1000 repetitions.
Researcher Affiliation | Academia | Department of Statistics, University of Oxford. Correspondence to: Stefano Cortinovis <EMAIL>.
Pseudocode | Yes | Algorithm 1 summarises the steps of the FAB-PPI approach in a general convex estimation problem. [...] Algorithm 2 summarises the FAB-PPI approach under the squared loss...
Open Source Code | Yes | Code for reproducing the experiments is available at https://github.com/stefanocortinovis/fab-ppi.
Open Datasets | Yes | We consider several estimation experiments using the datasets presented in Angelopoulos et al. (2023a) and briefly described in Section S5.1. [...] All of the datasets were downloaded from the examples provided as part of the ppi-py package (Angelopoulos et al., 2023b).
Dataset Splits | Yes | We sample two datasets, n labelled observations {(X_i, Y_i)}_{i=1}^n i.i.d. from P and N unlabelled observations {X̃_i}_{i=1}^N i.i.d. from P_X. [...] For this experiment, we assume that N is infinite, set n = 200, and vary γ between −1.5 and 1.5. [...] For this experiment, we set N = 10^6 and vary n from 100 to 1000. [...] Each dataset comes with covariate/label/prediction triples {(X_i, Y_i, f(X_i))}_{i=1}^N, which we randomly split into two subsets with n labelled and N − n unlabelled observations, for varying values of n.
Hardware Specification | Yes | All of the experiments presented here were run locally on an Intel Core i7-11850H CPU.
Software Dependencies | No | Code implementing the FAB-PPI method is written in Python and made available at https://github.com/stefanocortinovis/fab-ppi. Comparisons with standard PPI are performed using the ppi-py package (Angelopoulos et al., 2023b).
Experiment Setup | Yes | For all experiments, we set α = 0.1 and report the average mean squared error (MSE), interval volume, and coverage over 1000 repetitions. [...] We sample two datasets, n labelled observations {(X_i, Y_i)}_{i=1}^n i.i.d. from P and N unlabelled observations {X̃_i}_{i=1}^N i.i.d. from P_X. [...] In this experiment, we assume that N is infinite, set n = 200, and vary γ between −1.5 and 1.5. [...] For this experiment, we set N = 10^6 and vary n from 100 to 1000.
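To make the experimental protocol above concrete, here is a minimal NumPy sketch of the *classical* PPI baseline that FAB-PPI is compared against: estimate a mean from n labelled pairs (Y_i, f(X_i)) plus N unlabelled predictions f(X̃_i), and form a level-(1 − α) interval with α = 0.1 as in the quoted setup. This is an illustration of standard prediction-powered inference (Angelopoulos et al., 2023a), not the FAB-PPI method itself; the function name `ppi_mean_ci` and the toy data-generating process are assumptions for the sketch, not code from the paper's repository.

```python
import numpy as np
from scipy import stats

def ppi_mean_ci(y, yhat_lab, yhat_unlab, alpha=0.1):
    """Classical PPI point estimate and CI for a mean.

    y           -- labels Y_i on the n labelled points
    yhat_lab    -- predictions f(X_i) on the labelled points
    yhat_unlab  -- predictions f(X~_i) on the N unlabelled points
    """
    n, N = len(y), len(yhat_unlab)
    rectifier = y - yhat_lab                      # corrects the predictor's bias
    theta = yhat_unlab.mean() + rectifier.mean()  # PPI point estimate
    # Asymptotic standard error: unlabelled-prediction noise + rectifier noise
    se = np.sqrt(yhat_unlab.var(ddof=1) / N + rectifier.var(ddof=1) / n)
    z = stats.norm.ppf(1 - alpha / 2)
    return theta, (theta - z * se, theta + z * se)

# Toy example mirroring the quoted setup: n = 200 labelled, N = 10^6 unlabelled
rng = np.random.default_rng(0)
n, N = 200, 10**6
x_lab, x_unlab = rng.normal(size=n), rng.normal(size=N)
f = lambda x: x + 0.1                        # a deliberately biased predictor
y = x_lab + rng.normal(scale=0.5, size=n)    # true mean of Y is 0
theta, (lo, hi) = ppi_mean_ci(y, f(x_lab), f(x_unlab), alpha=0.1)
```

With many unlabelled points, the interval width is driven by the rectifier variance over the n labelled points; FAB-PPI's contribution, per the paper, is to shrink this baseline further via horseshoe or Gaussian priors.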