Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
On Training-Conditional Conformal Prediction and Binomial Proportion Confidence Intervals
Authors: Rudi Coppola, Manuel Mazo Espinosa
TMLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate empirically equation 9 as follows and represent the results graphically in Figure 2. We define a list of values for $E$ by $E_q = 0.01 + 0.01q$ for $q = 0, \ldots, 98$. For every value of $E_q$ we consider an underlying Bernoulli distribution with parameter $b_{1,q} = E_q - \alpha E_q < E_q$ (right figure) and an underlying Bernoulli distribution with parameter $b_{2,q} = E_q + \alpha E_q > E_q$ (left figure) with $\alpha = 0.005$. For every value of $q = 0, \ldots, 98$ we examine the two situations $b_{1,q} < E_q$ and $b_{2,q} > E_q$, as mentioned in Example 1 Part 2. The significance level $\epsilon$ is set to $2/3$. We draw $n_{\mathrm{cal}} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{\mathrm{cal}}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma_\epsilon^{(i)} \doteq \Gamma_\epsilon(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{\mathrm{test}} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{\mathrm{test}}}$ and compute the empirical frequency $\hat{g}_i = \lvert\{j = 1, \ldots, n_{\mathrm{test}} : z_{N+1}^{(j)} \in \Gamma_\epsilon^{(i)}\}\rvert / n_{\mathrm{test}}$ as an approximation for $P(Z_{N+1} \in \Gamma_\epsilon^{(i)})$; finally we compute $\hat{h} = \lvert\{i = 1, \ldots, n_{\mathrm{cal}} : \hat{g}_i \geq 1 - E\}\rvert / n_{\mathrm{cal}}$ as an approximation to $P^2(S_E)$ shown in the plots as the solid red line. |
| Researcher Affiliation | Academia | Rudi Coppola (EMAIL), Department of Mechanical Engineering, Delft University of Technology; Manuel Mazo Jr. (EMAIL), Department of Mechanical Engineering, Delft University of Technology |
| Pseudocode | No | The paper describes methods and theoretical analyses but does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statement about providing access to source code for the methodology described. |
| Open Datasets | No | For every value of $E_q$ we consider an underlying Bernoulli distribution... We draw $n_{\mathrm{cal}} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{\mathrm{cal}}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma_\epsilon^{(i)} \doteq \Gamma_\epsilon(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{\mathrm{test}} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{\mathrm{test}}}$ and compute the empirical frequency... This indicates that the data are generated by simulation rather than taken from a public dataset. |
| Dataset Splits | Yes | We draw $n_{\mathrm{cal}} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{\mathrm{cal}}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma_\epsilon^{(i)} \doteq \Gamma_\epsilon(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{\mathrm{test}} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{\mathrm{test}}}$ and compute the empirical frequency $\hat{g}_i = \lvert\{j = 1, \ldots, n_{\mathrm{test}} : z_{N+1}^{(j)} \in \Gamma_\epsilon^{(i)}\}\rvert / n_{\mathrm{test}}$ as an approximation for $P(Z_{N+1} \in \Gamma_\epsilon^{(i)})$; finally we compute $\hat{h} = \lvert\{i = 1, \ldots, n_{\mathrm{cal}} : \hat{g}_i \geq 1 - E\}\rvert / n_{\mathrm{cal}}$ as an approximation to $P^2(S_E)$ shown in the plots as the solid red line. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers. |
| Experiment Setup | Yes | We define a list of values for $E$ by $E_q = 0.01 + 0.01q$ for $q = 0, \ldots, 98$. For every value of $E_q$ we consider an underlying Bernoulli distribution with parameter $b_{1,q} = E_q - \alpha E_q < E_q$ (right figure) and an underlying Bernoulli distribution with parameter $b_{2,q} = E_q + \alpha E_q > E_q$ (left figure) with $\alpha = 0.005$. The significance level $\epsilon$ is set to $2/3$. We draw $n_{\mathrm{cal}} = 5 \cdot 10^4$ pairs of calibration points $\{z_1^{(i)}, z_2^{(i)}\}_{i=1}^{n_{\mathrm{cal}}}$. For every pair of calibration points $z_1^{(i)}, z_2^{(i)}$ we construct the resulting INP as $\Gamma_\epsilon^{(i)} \doteq \Gamma_\epsilon(z_1^{(i)}, z_2^{(i)}, \ldots)$, draw $n_{\mathrm{test}} = 5 \cdot 10^4$ test points $\{z_{N+1}^{(j)}\}_{j=1}^{n_{\mathrm{test}}}$ and compute the empirical frequency $\hat{g}_i = \lvert\{j = 1, \ldots, n_{\mathrm{test}} : z_{N+1}^{(j)} \in \Gamma_\epsilon^{(i)}\}\rvert / n_{\mathrm{test}}$ as an approximation for $P(Z_{N+1} \in \Gamma_\epsilon^{(i)})$; finally we compute $\hat{h} = \lvert\{i = 1, \ldots, n_{\mathrm{cal}} : \hat{g}_i \geq 1 - E\}\rvert / n_{\mathrm{cal}}$ as an approximation to $P^2(S_E)$ shown in the plots as the solid red line. |
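
The two-level Monte Carlo procedure quoted in the Experiment Setup row can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code: the paper's actual construction of the prediction set is not reproduced on this page, so `toy_prediction_set` is a hypothetical stand-in (a conformal-style rule with the assumed score s(z) = z), and the names `parameter_grid` and `estimate_h` are likewise invented here. The grid values, α = 0.005, ϵ = 2/3, and the sample sizes follow the quoted setup as best it can be read from the extracted text.

```python
from fractions import Fraction

import numpy as np

EPS = Fraction(2, 3)  # significance level quoted in the setup
ALPHA = 0.005         # relative offset quoted in the setup


def parameter_grid():
    """Return (E_q, b1_q, b2_q) for q = 0..98, with b1_q < E_q < b2_q."""
    grid = []
    for q in range(99):
        e_q = 0.01 + 0.01 * q
        grid.append((e_q, e_q - ALPHA * e_q, e_q + ALPHA * e_q))
    return grid


def toy_prediction_set(z1, z2, eps=EPS):
    """HYPOTHETICAL stand-in for the prediction set (not the paper's rule).

    A candidate z in {0, 1} is kept when its conformal p-value, computed
    with the assumed score s(z) = z, exceeds eps. Returns (has_0, has_1).
    """
    calib = (z1, z2)
    members = []
    for z in (0, 1):
        at_least = sum(1 for c in calib if c >= z)
        p_value = Fraction(at_least + 1, len(calib) + 1)
        members.append(p_value > eps)
    return tuple(members)


def estimate_h(E, b, n_cal=50_000, n_test=50_000, seed=0):
    """Monte Carlo estimate of the fraction of calibration pairs whose
    empirical coverage g_hat is at least 1 - E, for Bernoulli(b) data."""
    rng = np.random.default_rng(seed)
    pairs = rng.binomial(1, b, size=(n_cal, 2))
    # Number of test points equal to 1 for each pair; same distribution as
    # drawing n_test Bernoulli(b) test points and counting the ones.
    ones = rng.binomial(n_test, b, size=n_cal)
    # Lookup tables: membership of 0 and of 1 for each possible pair.
    has0_tab = np.array([[toy_prediction_set(a, c)[0] for c in (0, 1)] for a in (0, 1)])
    has1_tab = np.array([[toy_prediction_set(a, c)[1] for c in (0, 1)] for a in (0, 1)])
    has0 = has0_tab[pairs[:, 0], pairs[:, 1]]
    has1 = has1_tab[pairs[:, 0], pairs[:, 1]]
    g_hat = (has0 * (n_test - ones) + has1 * ones) / n_test
    return float(np.mean(g_hat >= 1 - E))


if __name__ == "__main__":
    # Evaluate a few grid points (q = 0, 49, 98) for both Bernoulli parameters.
    for e_q, b1, b2 in parameter_grid()[::49]:
        print(f"E={e_q:.2f}  h(b1)={estimate_h(e_q, b1):.3f}  h(b2)={estimate_h(e_q, b2):.3f}")
```

Because the prediction set depends only on the calibration pair and the data are binary, the sketch draws a single binomial count per pair instead of storing every test point, which keeps the default sample sizes tractable. Swapping in a different set construction only requires replacing `toy_prediction_set`.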