Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Online Locally Differentially Private Conformal Prediction via Binary Inquiries

Authors: Qiangqiang Zhang, Chenfei Gu, Xinwei Feng, Jinhan Xie, Ting Li

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive empirical evaluations on both simulated and real-world datasets demonstrate that the proposed method delivers accurate, stable, and privacy-preserving predictions across a range of dynamic environments.
Researcher Affiliation	Academia	1 Zhongtai Securities Institute for Financial Studies, Shandong University; 2 School of Statistics and Data Science, Shanghai University of Finance and Economics; 3 Yunnan Key Laboratory of Statistical Modeling and Data Analysis, Yunnan University
Pseudocode	Yes	Algorithm 1: Locally Randomized Binary Response (LRBR) ... Algorithm 2: Binary Private Online Conformal Prediction ... Algorithm 3: Binary Private Online Conformal Prediction (Classification)
Open Source Code	Yes	The implementation code and detailed instructions for reproducing the experimental results will be provided in the Supplemental Material submitted alongside the paper.
Open Datasets	Yes	We use the ELEC2 dataset [Harries et al., 1999]... We assess the classification performance of our method using the real WISDM dataset [Kwapisz et al., 2011].
Dataset Splits	No	We generate data similar to Barber et al. [2023] via $x_t \sim N(0, I_5)$ and $y_t = x_t^\top \beta_t + \varepsilon_t$ for $t = 1, \ldots, 10{,}000$, where $\beta_t \in \mathbb{R}^5$ and $\varepsilon_t$ is from a normal distribution and independent of $x_t$. We consider four cases... The sample size is set to 10,000. Long-run coverage and interval width are evaluated over these data points and averaged across 200 simulation runs.
Hardware Specification	Yes	The experiments were conducted on a laptop equipped with 16GB RAM and an AMD Ryzen 9 7940HX processor with Radeon Graphics.
Software Dependencies	No	We construct a time-indexed data stream and implement the model $\hat{f}_t$ using XGBoost. Rolling coverage is calculated using a sliding window of 200 points to capture short-term fluctuations, while long-run coverage reflects the cumulative average over all time steps.
Experiment Setup	Yes	Input: local nonconformity score $S$, private quantile $q$, response rate $r$ ... Input: Data stream $\{(X_t, Y_t)\}_{t \ge 1}$; Response rate $r_t \in (0, 1)$; Miscoverage level $\alpha \in (0, 1)$; Initialize $W_0 = 1$, $\lambda_1 = 0$, $q_1 = 0$ ... The privacy parameter is set to $\epsilon = 0.5, 1, 2, 3$, with the corresponding values of $r$ provided in Table 4. We define the miscoverage level as $\alpha = 0.1$.
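
The two coverage metrics quoted in the table (rolling coverage over a sliding window of 200 points, and long-run coverage as the cumulative average over all time steps) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `rolling_coverage` and `long_run_coverage` are hypothetical, and the `covered` indicators in the demo are synthetic placeholders drawn at a 90% hit rate to mirror the stated miscoverage level of 0.1, not results from the paper.

```python
import numpy as np


def long_run_coverage(covered):
    """Cumulative average of coverage indicators up to each time step."""
    covered = np.asarray(covered, dtype=float)
    return np.cumsum(covered) / np.arange(1, covered.size + 1)


def rolling_coverage(covered, window=200):
    """Mean coverage over each full trailing window of `window` points.

    Returns one value per time step from index window-1 onward; steps
    without a full window behind them are skipped.
    """
    covered = np.asarray(covered, dtype=float)
    if covered.size < window:
        return np.empty(0)
    # Prefix sums let each window mean be computed in O(1).
    csum = np.concatenate(([0.0], np.cumsum(covered)))
    return (csum[window:] - csum[:-window]) / window


# Synthetic demo (assumption): 10,000 indicators, each True with prob. 0.9.
# In a real run, covered[t] would be True when Y_t fell inside the
# prediction set issued at time t.
rng = np.random.default_rng(0)
covered = rng.random(10_000) < 0.9

long_run = long_run_coverage(covered)          # shape (10000,)
rolling = rolling_coverage(covered, window=200)  # shape (9801,)
```

Averaging the final entry of `long_run` across repeated runs would correspond to the "averaged across 200 simulation runs" summary mentioned in the table, under the assumption that each run produces its own indicator sequence.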