Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Adaptive Data Analysis for Growing Data

Authors: Neil Marchant, Benjamin I.P. Rubinstein

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our bound empirically outperforms baselines composed from static bounds. In a batched query setting, the asymptotic data requirements of our bound grow with the square root of the number of adaptive queries for a fixed accuracy goal (assuming the ratio of final to initial data size is held constant). This improvement matches the improvement of bounds for static data [15] over the data splitting baseline. Section 4.2, Empirical Comparison with Alternative Guarantees: We empirically compare our generalization bounds for growing data with baselines composed from bounds for static data.
Researcher Affiliation	Academia	Neil G. Marchant, School of Computing & Information Systems, University of Melbourne, Australia; Benjamin I. P. Rubinstein, School of Computing & Information Systems, University of Melbourne, Australia
Pseudocode	Yes	Algorithm 1: Interaction between A and M. Algorithm 2: Composition of Clipped Gaussian Mechanisms with zCDP Privacy Filter.
Open Source Code	No	Our empirical results are obtained by evaluating mathematical expressions, so there is no data or code to release.
Open Datasets	No	Our empirical results are obtained by evaluating mathematical expressions, so there is no data or code to release.
Dataset Splits	No	We do not train or test models. When instantiating our bounds in Figures 3, 4 and 6, we specify parameter settings in the captions.
Hardware Specification	No	We do not conduct experiments that require significant compute resources.
Software Dependencies	No	Our empirical results are obtained by evaluating mathematical expressions, so there is no data or code to release.
Experiment Setup	No	We do not train or test models. When instantiating our bounds in Figures 3, 4 and 6, we specify parameter settings in the captions.