Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Chebyshev-Cantelli PAC-Bayes-Bennett Inequality for the Weighted Majority Vote

Authors: Yi-Shan Wu, Andrés Masegosa, Stephan Lorenzen, Christian Igel, Yevgeny Seldin

NeurIPS 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response (supporting quote)
Research Type | Experimental | "We start with a simulated comparison of the oracle bounds and then present an empirical evaluation on real data." and "We studied the empirical performance of the bounds using standard random forest [Breiman, 2001] and a combination of heterogeneous classifiers on a subset of data sets from the UCI and LibSVM repositories [Dua and Graff, 2019, Chang and Lin, 2011]."
Researcher Affiliation | Academia | Yi-Shan Wu (University of Copenhagen), Andrés R. Masegosa (Aalborg University), Stephan S. Lorenzen (University of Copenhagen), Christian Igel (University of Copenhagen), Yevgeny Seldin (University of Copenhagen)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The python source code for replicating the experiments is available at Github" (https://github.com/StephanLorenzen/MajorityVoteBounds)
Open Datasets | Yes | "We studied the empirical performance of the bounds using standard random forest [Breiman, 2001] and a combination of heterogeneous classifiers on a subset of data sets from the UCI and LibSVM repositories [Dua and Graff, 2019, Chang and Lin, 2011]."
Dataset Splits | Yes | "For each data set, we set aside 20% of the data for the test set Stest and used the remaining data S for ensemble construction, weight optimization and bound evaluation." and "the empirical losses ˆL(h, S) in the bounds are replaced by the validation losses ˆL(h, Sh), and the sample size n is replaced by the minimal validation size minh |Sh|" and "we generate these splits by bagging, where out-of-bag (OOB) samples Sh provide unbiased estimates of expected losses of individual hypotheses h."
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided for the experimental setup.
Software Dependencies | No | The paper mentions "the python source code" but does not specify a Python version or any library/solver names with version numbers.
Experiment Setup | Yes | "We take 100 fully grown trees, use the Gini criterion for splitting, and consider d features in each split."
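The Dataset Splits row quotes the paper's bagging construction, in which each hypothesis is trained on a bootstrap sample and the out-of-bag (OOB) points it never saw provide an unbiased estimate of its loss. A minimal sketch of that construction, not taken from the authors' repository; the toy data and hypothesis are invented for illustration:

```python
# Illustrative sketch (not the authors' code): bagging with out-of-bag
# (OOB) validation sets. Each hypothesis h is fit on a bootstrap sample;
# its OOB set S_h (the points absent from the sample) yields an unbiased
# loss estimate, and min_h |S_h| replaces n in the bounds.
import random

def bootstrap_split(n, rng):
    """Return (bag_indices, oob_indices) for one bootstrap draw of size n."""
    bag = [rng.randrange(n) for _ in range(n)]
    oob = sorted(set(range(n)) - set(bag))
    return bag, oob

def oob_loss(h, X, y, oob):
    """0-1 validation loss of hypothesis h on its out-of-bag points."""
    errors = sum(1 for i in oob if h(X[i]) != y[i])
    return errors / len(oob)

rng = random.Random(0)
n = 20
X = list(range(n))
y = [x % 2 for x in X]        # toy labels
bag, oob = bootstrap_split(n, rng)
h = lambda x: x % 2           # a toy hypothesis that happens to be perfect
print(oob_loss(h, X, y, oob))  # -> 0.0
```

On expectation roughly a 1/e fraction of the points land in each OOB set, so every hypothesis gets a nontrivial validation sample without setting aside extra data.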
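The Experiment Setup row mentions the Gini criterion used for splitting the fully grown trees. A hedged sketch of the impurity measure such a criterion minimizes, with function names of our own choosing (the paper's repository may implement this differently):

```python
# Sketch of the Gini impurity underlying the splitting criterion in the
# random-forest experiments: a split is chosen to minimize the weighted
# Gini impurity 1 - sum_k p_k^2 of the resulting child nodes.
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def split_impurity(left, right):
    """Size-weighted Gini impurity of a candidate split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)

print(gini([0, 0, 1, 1]))              # -> 0.5 (maximally mixed, two classes)
print(split_impurity([0, 0], [1, 1]))  # -> 0.0 (a pure split)
```

"Fully grown" trees keep splitting until leaves are pure, i.e. until every leaf has Gini impurity zero.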