Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Single and Multiple Change-Point Detection with Differential Privacy

Authors: Wanrong Zhang, Sara Krehbiel, Rui Tuo, Yajun Mei, Rachel Cummings

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type: Experimental
LLM Response: Finally, we run several Monte Carlo experiments to validate our theoretical results for both the online and offline settings. We consider data drawn from Bernoulli distributions, which satisfies our first distributional assumption, as well as Gaussian and Gamma distributions, which satisfy our second distributional assumptions. Our offline experiments are summarized in Figure 1, which shows that change-point detection is easier when P0 and P1 are further apart and harder when the privacy requirement is stronger (ϵ is smaller). Additionally, these experiments enhance our theoretical results, finding that Offline PCPD performs well even when we relax the assumptions required for our theoretical accuracy bounds by running our algorithm on imperfect hypotheses P0 and P1 that are closer together than the true distributions from which data are drawn. Figure 3 shows that Online PCPD also performs well, consistent with our theoretical guarantees. (...) In this section, we present results from Monte Carlo experiments designed to validate the theoretical results of previous sections. Our simulations consider both offline (Section 5.1) and online settings (Section 5.2) for the canonical problems of detecting a change in the mean of Bernoulli or Gaussian distributions.
Researcher Affiliation: Academia
LLM Response: Wanrong Zhang (EMAIL), H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA; Sara Krehbiel (EMAIL), Department of Math and Computer Science, Santa Clara University, Santa Clara, CA 95053, USA; Rui Tuo (EMAIL), Wm Michael Barnes '64 Department of Industrial and Systems Engineering, Texas A&M University, College Station, TX 77843, USA; Yajun Mei (EMAIL), H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA; Rachel Cummings (EMAIL), H. Milton Stewart School of Industrial and Systems Engineering, Georgia Institute of Technology, Atlanta, GA 30332, USA
Pseudocode: Yes
LLM Response:
Algorithm 1 Report Noisy Max: Report Max(X, , {f1, ..., fm}, ϵ)
Algorithm 2 Above Noisy Threshold: Above Thresh(X, , {f1, f2, ...}, T, ϵ)
Algorithm 3 Offline private change-point detector: Offline PCPD(X, P0, P1, ϵ, n)
Algorithm 4 Offline private change-point detector: Offline PTCPD(X, P0, P1, ϵ, n, A)
Algorithm 5 Online private change-point detector: Online PCPD(X, P0, P1, ϵ, n, T)
Algorithm 6 Online private multiple change-point detector: Online PMCPD(X, P0, ..., Pm, ϵ, n, T1, ..., Tm)
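As a reading aid, the Report Noisy Max mechanism named in Algorithm 1 can be sketched in a few lines. This is a generic sketch of the standard mechanism (Laplace noise added to each query before taking the argmax), not the paper's own implementation; the `laplace` helper and the 1/ϵ noise scale (which assumes sensitivity-1 queries) are illustrative assumptions.

```python
import math
import random

def laplace(scale, rng):
    # Inverse-CDF sampling from a Laplace(0, scale) distribution.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def report_noisy_max(X, queries, epsilon, rng):
    """Sketch of Report Noisy Max: perturb each query value with
    Laplace noise and return the index of the largest noisy value.
    The 1/epsilon scale assumes sensitivity-1 queries; the paper's
    exact calibration may differ."""
    noisy = [f(X) + laplace(1.0 / epsilon, rng) for f in queries]
    return max(range(len(noisy)), key=lambda i: noisy[i])
```

When the query values are far apart relative to the noise scale, the mechanism returns the true argmax with high probability, which is what makes it useful for picking the most likely change point privately.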
Open Source Code: No
LLM Response: The paper does not contain any explicit statement about providing source code or a link to a code repository. The license mentioned (CC-BY 4.0) pertains to the paper itself, not an accompanying software release.
Open Datasets: No
LLM Response: Our simulations consider both offline (Section 5.1) and online settings (Section 5.2) for the canonical problems of detecting a change in the mean of Bernoulli or Gaussian distributions. In the offline setting, we additionally show that our algorithms can accurately detect changes in the variance of Gaussian distribution and detect changes in the shape parameter of a Gamma distribution. (...) Each simulation is characterized by a probability distribution family (Bernoulli, Gaussian, or Gamma), a distribution parameter that changes (mean, standard deviation, or shape), and a change magnitude (large, small, or underspecified).
Dataset Splits: No
LLM Response: The paper describes Monte Carlo simulations where data is generated based on specified distributions and parameters (e.g., "n = 200 observations where the true change occurs at time k = 100"). It does not use pre-existing datasets with defined training, validation, or test splits.
Hardware Specification: No
LLM Response: The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory specifications) used for running the experiments or simulations.
Software Dependencies: No
LLM Response: The paper does not provide specific details or version numbers for any software libraries, programming languages, or tools used in the implementation or experimentation.
Experiment Setup: Yes
LLM Response: We vary privacy parameter ϵ = 0.1, 0.5, 1 and ∞, representing the non-private case. For each of our simulations, we use n = 200 observations where the true change occurs at time k = 100. This process is repeated 10^4 times. (...) For the Gaussian distributions, we truncate the log-likelihoods in the main algorithm and call Offline PTCPD with A = 0.1. (...) The new challenge is to choose an appropriate sliding window size n and corresponding threshold T in order to achieve good overall accuracy. The window size of n = 200 used in the offline simulations does not permit any threshold that reasonably controls both false positive and false negative rates, so we choose a larger window size of n = 700 and restrict our online simulations to ϵ = 0.5, 1, ∞. (...) For the Bernoulli model, this resulted in a choice of T = 220 for all values of ϵ = 0.5, 1, ∞. In the Gaussian model, we chose T = 8, 4.5, 100 for ϵ = 0.5, 1, ∞, respectively.
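The offline simulation described above (Bernoulli data, n = 200, true change at k = 100, with the change point reported via a noisy argmax over log-likelihood scores) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' code: the function names `offline_change_point` and `laplace`, the sensitivity bound `A`, and the noise calibration are placeholders for the paper's exact constants.

```python
import math
import random

def laplace(scale, rng):
    # Inverse-CDF sampling from a Laplace(0, scale) distribution.
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def offline_change_point(x, p0, p1, epsilon, rng):
    """Illustrative offline private change-point detector for Bernoulli
    data: score each candidate change time k by the log-likelihood of
    P0 before k and P1 from k on, then report the noisy argmax."""
    def loglik(k):
        s = 0.0
        for i, xi in enumerate(x):
            p = p1 if i >= k else p0
            s += math.log(p if xi == 1 else 1.0 - p)
        return s
    # A bounds how much one data point can move any score (assumed calibration).
    A = abs(math.log(p1 / p0)) + abs(math.log((1 - p1) / (1 - p0)))
    noisy = {k: loglik(k) + laplace(A / epsilon, rng) for k in range(1, len(x))}
    return max(noisy, key=noisy.get)

# n = 200 Bernoulli observations with a true change at time k = 100.
rng = random.Random(7)
x = [1 if rng.random() < (0.2 if i < 100 else 0.8) else 0 for i in range(200)]
k_hat = offline_change_point(x, p0=0.2, p1=0.8, epsilon=1.0, rng=rng)
```

Repeating runs like this over many trials and over a grid of ϵ values (with very large ϵ approximating the non-private case) reproduces the shape of the experiment the excerpt describes: detection accuracy degrades as ϵ shrinks or as P0 and P1 move closer together.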