Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
PAC-Bayesian AUC classification and scoring
Authors: James Ridgway, Pierre Alquier, Nicolas Chopin, Feng Liang
NeurIPS 2014 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We now compare our PAC-Bayesian approach (computed with EP) with Bayesian logistic regression (to deal with non-identifiable cases), and with the rankboost algorithm [Freund et al., 2003] on different datasets; note that Cortes and Mohri [2003] showed that the function optimised by rankboost is AUC. and Table 1: Comparison of AUC. |
| Researcher Affiliation | Academia | James Ridgway, CREST and CEREMADE University Dauphine; Pierre Alquier, CREST (ENSAE); Nicolas Chopin, CREST (ENSAE) and HEC Paris; Feng Liang, University of Illinois at Urbana-Champaign |
| Pseudocode | Yes | Algorithm 1 Tempering SMC |
| Open Source Code | No | No explicit statement about providing open-source code for the methodology described in this paper. |
| Open Datasets | Yes | All available at http://archive.ics.uci.edu/ml/ |
| Dataset Splits | Yes | As mentioned in Section 3, we set the prior hyperparameters by maximizing the evidence, and we use cross-validation to choose γ. |
| Hardware Specification | No | No mention of specific hardware used for experiments. |
| Software Dependencies | No | No specific software dependencies with version numbers are provided. |
| Experiment Setup | Yes | As mentioned in Section 3, we set the prior hyperparameters by maximizing the evidence, and we use cross-validation to choose γ. To ensure convergence of EP, when dealing with difficult sites, we use damping [Seeger, 2005]. |