Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

STAR-Bets: Sequential TArget-Recalculating Bets for Tighter Confidence Intervals

Authors: Vaclav Voracek, Francesco Orabona

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Now we provide some experiments suggesting that STAR-Bets yields shorter confidence intervals than alternative methods. Here, we provide a teaser of our experiments, while the extensive experimental evaluation is in Appendix C.
Researcher Affiliation	Collaboration	Václav Voráček Second Foundation EMAIL Francesco Orabona King Abdullah University of Science and Technology EMAIL
Pseudocode	Yes	Algorithm 1: Hoeffding testing; Algorithm 2: Hoeffding testing; Algorithm 3: Testing with Bets; Algorithm 4: Testing with STAR-Bets; Algorithm 5: Bernstein testing; Algorithm 6: Bernstein testing
Open Source Code	Yes	The code is available on GitHub.
Open Datasets	No	First, we perform several experiments with Beta and Bernoulli distribution to quickly assess the competing methods.
Dataset Splits	No	For all the methods and every n ∈ {8, 16, ..., 256}, we have estimated the mean 1000× of a fresh realization of the corresponding random variable and plotted the average distance to the mean.
Hardware Specification	No	The paper does not provide specific hardware details for the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers.
Experiment Setup	Yes	We always show STAR-Bets from Algorithm 4 with details from D. ... The majority of the experiments are made with δ = 0.05... In the following experiments, we used Algorithm 4 with the details in Appendix D sweeping over exponentially spaced grid of values of c. ... We instantiate the 10 log 8αn/{t 1}^2 term as cmn/{t 1}^2 with c = 1.
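
The experimental protocol quoted in the table (Beta and Bernoulli samples, several sample sizes n, 1000 fresh realizations per setting, δ = 0.05) can be illustrated with a minimal sketch. This is not the authors' code and does not implement STAR-Bets; it uses only the classical two-sided Hoeffding interval, one of the baseline families named in the Pseudocode row. The function names, the distribution parameters, the doubling grid of n, and the choice to report average interval width and empirical coverage are all assumptions made for illustration.

```python
import math
import random


def hoeffding_half_width(n: int, delta: float) -> float:
    """Half-width of the two-sided Hoeffding interval for samples in [0, 1]."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))


def run_trials(sampler, true_mean, n, delta=0.05, trials=1000, seed=0):
    """Return (empirical coverage, average interval width) over fresh samples.

    `sampler` is a hypothetical callable taking a random.Random instance and
    returning one observation in [0, 1].
    """
    rng = random.Random(seed)
    half = hoeffding_half_width(n, delta)
    covered = 0
    total_width = 0.0
    for _ in range(trials):
        xs = [sampler(rng) for _ in range(n)]
        mean_hat = sum(xs) / n
        # Clip the interval to [0, 1], the known range of the observations.
        lower = max(0.0, mean_hat - half)
        upper = min(1.0, mean_hat + half)
        covered += lower <= true_mean <= upper
        total_width += upper - lower
    return covered / trials, total_width / trials


if __name__ == "__main__":
    # Assumed settings: Bernoulli(0.5) and Beta(2, 5); the source does not
    # state the parameters used in the paper.
    settings = {
        "Bernoulli(0.5)": (lambda rng: 1.0 if rng.random() < 0.5 else 0.0, 0.5),
        "Beta(2, 5)": (lambda rng: rng.betavariate(2.0, 5.0), 2.0 / 7.0),
    }
    for name, (sampler, mu) in settings.items():
        for n in (8, 16, 32, 64, 128, 256):  # assumed doubling grid
            coverage, width = run_trials(sampler, mu, n)
            print(f"{name:15s} n={n:3d} coverage={coverage:.3f} width={width:.3f}")
```

Because Hoeffding's inequality ignores the variance of the data, the empirical coverage in such a run is typically well above 1 − δ, which is the kind of slack that tighter methods aim to reduce.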