Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

On Training-Conditional Conformal Prediction and Binomial Proportion Confidence Intervals

Authors: Rudi Coppola, Manuel Mazo Espinosa

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate empirically equation 9 as follows and represent the results graphically in Figure 2. We define a list of values for $E$ by $E_q = 0.01 + 0.01q$ for $q = 0, \ldots, 98$. For every value of $E_q$ we consider an underlying Bernoulli distribution with parameter $b_{1,q} = E_q - \alpha E_q < E_q$ (right figure) and an underlying Bernoulli distribution with parameter $b_{2,q} = E_q + \alpha E_q > E_q$ (left figure) with $\alpha = 0.005$. For every value of $q = 0, \ldots, 98$ we examine the two situations $b_{1,q} < E_q$ and $b_{2,q} > E_q$, as mentioned in Example 1 Part 2. The significance level $\epsilon$ is set to $2/3$. We draw $n_{cal} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{cal}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma^{\epsilon}_{(i)} \doteq \Gamma^{\epsilon}(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{test} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{test}}$ and compute the empirical frequency $\hat{g}_i = \lvert\{j = 1, \ldots, n_{test} : z_{N+1}^{(j)} \in \Gamma^{\epsilon}_{(i)}\}\rvert / n_{test}$ as an approximation for $P(Z_{N+1} \in \Gamma^{\epsilon}_{(i)})$; finally we compute $\hat{h} = \lvert\{i = 1, \ldots, n_{cal} : \hat{g}_i \geq 1 - E\}\rvert / n_{cal}$ as an approximation to $P^2(S_E)$ shown in the plots as the solid red line.
Researcher Affiliation	Academia	Rudi Coppola EMAIL Department of Mechanical Engineering Delft University of Technology Manuel Mazo Jr. EMAIL Department of Mechanical Engineering Delft University of Technology
Pseudocode	No	The paper describes methods and theoretical analyses but does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code	No	The paper does not contain any explicit statement about providing access to source code for the methodology described.
Open Datasets	No	For every value of $E_q$ we consider an underlying Bernoulli distribution... We draw $n_{cal} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{cal}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma^{\epsilon}_{(i)} \doteq \Gamma^{\epsilon}(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{test} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{test}}$ and compute the empirical frequency... This indicates that the data are generated by simulation rather than taken from a public dataset.
Dataset Splits	Yes	We draw $n_{cal} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{cal}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma^{\epsilon}_{(i)} \doteq \Gamma^{\epsilon}(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{test} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{test}}$ and compute the empirical frequency $\hat{g}_i = \lvert\{j = 1, \ldots, n_{test} : z_{N+1}^{(j)} \in \Gamma^{\epsilon}_{(i)}\}\rvert / n_{test}$ as an approximation for $P(Z_{N+1} \in \Gamma^{\epsilon}_{(i)})$; finally we compute $\hat{h} = \lvert\{i = 1, \ldots, n_{cal} : \hat{g}_i \geq 1 - E\}\rvert / n_{cal}$ as an approximation to $P^2(S_E)$ shown in the plots as the solid red line.
Hardware Specification	No	The paper does not provide specific details about the hardware used for running experiments.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers.
Experiment Setup	Yes	We define a list of values for $E$ by $E_q = 0.01 + 0.01q$ for $q = 0, \ldots, 98$. For every value of $E_q$ we consider an underlying Bernoulli distribution with parameter $b_{1,q} = E_q - \alpha E_q < E_q$ (right figure) and an underlying Bernoulli distribution with parameter $b_{2,q} = E_q + \alpha E_q > E_q$ (left figure) with $\alpha = 0.005$. The significance level $\epsilon$ is set to $2/3$. We draw $n_{cal} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{cal}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma^{\epsilon}_{(i)} \doteq \Gamma^{\epsilon}(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{test} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{test}}$ and compute the empirical frequency $\hat{g}_i = \lvert\{j = 1, \ldots, n_{test} : z_{N+1}^{(j)} \in \Gamma^{\epsilon}_{(i)}\}\rvert / n_{test}$ as an approximation for $P(Z_{N+1} \in \Gamma^{\epsilon}_{(i)})$; finally we compute $\hat{h} = \lvert\{i = 1, \ldots, n_{cal} : \hat{g}_i \geq 1 - E\}\rvert / n_{cal}$ as an approximation to $P^2(S_E)$ shown in the plots as the solid red line.
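
The Monte Carlo procedure quoted in the Experiment Setup row can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the authors' code (the table above records that no source code was released). The quoted text does not say how the predictor Γ is built from the calibration pair, so `build_set` is a purely hypothetical stand-in. The grid reads the two Bernoulli parameters as E_q minus and plus α·E_q, and the final comparison as ĝ_i ≥ 1 − E; both readings are assumptions about the garbled extraction. The function names and the small default sample sizes are likewise illustrative; the quoted setup draws 5·10^4 calibration pairs and 5·10^4 test points.

```python
import random


def parameter_grid(alpha=0.005):
    """Return (E_q, b_1q, b_2q) for q = 0, ..., 98 with E_q = 0.01 + 0.01 q.

    ASSUMPTION: b_1q = E_q - alpha * E_q and b_2q = E_q + alpha * E_q.
    """
    grid = []
    for q in range(99):
        e_q = 0.01 + 0.01 * q
        grid.append((e_q, e_q - alpha * e_q, e_q + alpha * e_q))
    return grid


def build_set(z1, z2):
    """HYPOTHETICAL stand-in for the predictor Gamma(z1, z2, ...).

    The quoted text does not define the construction; this placeholder
    simply returns the set of values seen in the calibration pair.
    """
    return {z1, z2}


def estimate_h(b, e, n_cal=500, n_test=500, seed=0):
    """Estimate h-hat: the fraction of calibration pairs whose empirical
    coverage g-hat_i is at least 1 - e, for Bernoulli(b) data."""
    rng = random.Random(seed)

    def draw():
        # One Bernoulli(b) sample in {0, 1}.
        return 1 if rng.random() < b else 0

    hits = 0
    for _ in range(n_cal):
        gamma = build_set(draw(), draw())
        # g-hat_i: share of fresh test points that fall inside the set.
        covered = sum(1 for _ in range(n_test) if draw() in gamma)
        g_hat = covered / n_test
        if g_hat >= 1 - e:  # ASSUMPTION: the lost relation is ">= 1 - E"
            hits += 1
    return hits / n_cal


if __name__ == "__main__":
    for e_q, b1, b2 in parameter_grid()[:3]:
        print(e_q, estimate_h(b1, e_q), estimate_h(b2, e_q))
```

Swapping in the paper's actual predictor only requires replacing `build_set`; the two nested sampling loops mirror the calibration and test draws described in the quoted setup.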