Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Doubly Robust Bayesian Inference for Non-Stationary Streaming Data with $\beta$-Divergences
Authors: Jeremias Knoblauch, Jack E. Jewson, Theodoros Damoulas
NeurIPS 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Reducing False Discovery Rates of CPS from over 90% to 0% on real world data, this offers the state of the art. ... Lastly, Section 5 showcases the substantial gains in performance of robust BOCPD when compared to its standard version on real world data in terms of both predictive error and CP detection. ... Figure 4 shows that robust BOCPD deals with outliers on-line. |
| Researcher Affiliation | Academia | Jeremias Knoblauch: The Alan Turing Institute; Department of Statistics, University of Warwick, Coventry, CV4 7AL. Jack Jewson: Department of Statistics, University of Warwick, Coventry, CV4 7AL. Theodoros Damoulas: The Alan Turing Institute; Department of Computer Science & Department of Statistics, University of Warwick, Coventry, CV4 7AL. |
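| Pseudocode | Yes | Stochastic Variance Reduced Gradient (SVRG) inference for BOCPD. Input at time 0: window and batch sizes $W$, $B^*$, $b^*$; frequency $m$; prior $\theta_0$; number of steps $K$; step size $\eta$; s.t. $W > B^* > b^*$; $\sim$ denotes sampling without replacement. for next observation $y_t$ at time $t$ do: for retained run-lengths $r \in R(t)$ do: if $\tau_r = 0$ then: if $r < W$ then $\theta_r \leftarrow \theta_r^* \leftarrow \mathrm{FullOpt}(\mathrm{ELBO}(y_{t-r:t}))$; $\tau_r \leftarrow m$; else if $r \geq W$ then $\theta_r^* \leftarrow \theta_r$; $\tau_r \leftarrow \mathrm{Geom}(B^*/(B^* + b^*))$; $B \leftarrow \min(B^*, r)$; $g_r^{\text{anchor}} \leftarrow \frac{1}{B} \sum_{i \in I} \nabla \mathrm{ELBO}(\theta_r^*, y_{t-i})$, where $I \sim \mathrm{Unif}\{0, \dots, \min(r, W)\}$, $\lvert I \rvert = B$. for $j = 1, 2, \dots, K$ do: $b \leftarrow \min(b^*, r)$ and $\tilde{I} \sim \mathrm{Unif}\{0, \dots, \min(r, W)\}$ with $\lvert \tilde{I} \rvert = b$; $g_r^{\text{old}} \leftarrow \frac{1}{b} \sum_{i \in \tilde{I}} \nabla \mathrm{ELBO}(\theta_r^*, y_{t-i})$, $g_r^{\text{new}} \leftarrow \frac{1}{b} \sum_{i \in \tilde{I}} \nabla \mathrm{ELBO}(\theta_r, y_{t-i})$; $\theta_r \leftarrow \theta_r + \eta \cdot (g_r^{\text{new}} - g_r^{\text{old}} + g_r^{\text{anchor}})$; $\tau_r \leftarrow \tau_r - 1$. $r \leftarrow r + 1$ for all $r \in R(t)$; $R(t) \leftarrow R(t) \cup \{0\}$. |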
| Open Source Code | Yes | Software and simulation code is available as part of a reproducibility award at https://github.com/alan-turing-institute/rbocpdms/. |
| Open Datasets | Yes | The well-log data set was first studied in Ruanaidh et al. [39] and has become a benchmark data set for univariate CP detection. ... we also analyze Nitrogen Oxide (NOX) levels across 29 stations in London using spatially structured Bayesian Vector Autoregressions [see 25]. |
| Dataset Splits | No | The paper mentions using specific datasets for experiments but does not provide details on training, validation, or test splits (e.g., percentages, sample counts, or cross-validation setup). |
| Hardware Specification | Yes | The robust version has more computational overhead than standard BOCPD, but still needs less than 0.5 seconds per observation using a 3.1 GHz Intel i7 and 16GB RAM. |
| Software Dependencies | No | The paper mentions 'Python scipy’s L-BFGS-B optimization routine' but does not specify a version number for scipy or any other key software dependencies. |
| Experiment Setup | Yes | for which we initialize $\beta_p = 0.05$ and $\beta_p = 0.005$ for $d = 1$ and $d = 29$, respectively. ... In our experiments $L$ is a bounded absolute loss. |
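
The variance-reduced update quoted in the Pseudocode row (new gradient minus old gradient plus anchor gradient) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `grad_objective` and `svrg_ascent` are hypothetical, and a toy quadratic objective stands in for the per-observation ELBO. The run-length bookkeeping, the window $W$, and the geometric re-anchoring schedule are all omitted.

```python
import numpy as np


def grad_objective(theta, y):
    """Gradient w.r.t. theta of the toy objective -0.5 * (y - theta)**2.

    Assumption: a stand-in for the per-observation ELBO gradient,
    chosen only so that the sketch is runnable.
    """
    return y - theta


def svrg_ascent(y, theta0, batch_anchor, batch_step, num_steps, step_size, rng):
    """Run SVRG-style ascent steps from a single fixed anchor point."""
    theta_anchor = float(theta0)  # anchor parameter, held fixed during the inner loop
    theta = float(theta0)         # parameter that is actually updated

    # Anchor gradient: average over a larger batch, sampled without replacement.
    idx_anchor = rng.choice(len(y), size=min(batch_anchor, len(y)), replace=False)
    g_anchor = grad_objective(theta_anchor, y[idx_anchor]).mean()

    for _ in range(num_steps):
        # Small batch shared by the old (anchor) and new (current) gradients.
        idx = rng.choice(len(y), size=min(batch_step, len(y)), replace=False)
        g_old = grad_objective(theta_anchor, y[idx]).mean()
        g_new = grad_objective(theta, y[idx]).mean()
        # Variance-reduced ascent step.
        theta = theta + step_size * (g_new - g_old + g_anchor)
    return theta


# Usage on synthetic data (illustrative values only).
rng = np.random.default_rng(0)
y = rng.normal(loc=3.0, scale=1.0, size=200)
theta_hat = svrg_ascent(y, theta0=0.0, batch_anchor=200, batch_step=10,
                        num_steps=50, step_size=0.5, rng=rng)
```

For this toy objective the mini-batch noise in `g_new` and `g_old` cancels exactly, so the iterate moves toward the mean of the anchor batch. With a real ELBO the cancellation is only approximate, which is why the pseudocode periodically refreshes the anchor.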