Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Robust Bayesian Classification Using An Optimistic Score Ratio
Authors: Viet Anh Nguyen, Nian Si, Jose Blanchet
ICML 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We showcase the power of the proposed optimistic score ratio classifier on both synthetic and empirical data. |
| Researcher Affiliation | Academia | 1Stanford University. Correspondence to: Viet Anh Nguyen <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Optimistic score ratio classification |
| Open Source Code | Yes | All experiments are run on a standard laptop with 1.4 GHz Intel Core i5 and 8GB of memory, the codes and datasets are available at https://github.com/nian-si/bsc. |
| Open Datasets | Yes | We test the performance of our classification rules on various datasets from the UCI repository (Dua & Graff, 2017). The corresponding reference is: Dua, D. and Graff, C. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml. |
| Dataset Splits | Yes | for all methods that need cross-validation, we randomly select 75% of the data for training and the remaining 25% for testing. The size of the ambiguity sets and the regularization parameter are selected using stratified 5-fold cross-validation. |
| Hardware Specification | Yes | All experiments are run on a standard laptop with 1.4 GHz Intel Core i5 and 8GB of memory |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers. |
| Experiment Setup | Yes | We tune the threshold to maximize the training accuracy following (13) after computing the ratio value for each training sample. For the second criteria, we choose $\rho_c = n_c^{-1} \chi^2_{\alpha}(d(d+3)/2)$ for $c \in \{0, 1\}$, where $n_c$ is the number of training samples in class $c$ and $\chi^2_{\alpha}(d(d+3)/2)$ is the $\alpha$-quantile of the chi-square distribution with $d(d+3)/2$ degrees of freedom. ... so we select numerically $\alpha = 0.5$ in our experiments. |
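
The split protocol and the radius rule quoted in the table can be illustrated with a minimal sketch. This is not the authors' code, which is available at the repository linked above. All function names here (`train_test_split`, `stratified_folds`, `chi2_quantile_approx`, `radius`) are hypothetical. The chi-square quantile is computed with the Wilson-Hilferty approximation, a stand-in chosen only to keep the example dependency-free. An exact quantile routine would give slightly different values.

```python
import random
from collections import defaultdict
from statistics import NormalDist


def train_test_split(labels, train_frac=0.75, seed=0):
    """Randomly split sample indices into train and test parts."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    cut = int(round(train_frac * len(idx)))
    return idx[:cut], idx[cut:]


def stratified_folds(indices, labels, k=5, seed=0):
    """Assign indices to k folds so each class is spread evenly across folds."""
    by_class = defaultdict(list)
    for i in indices:
        by_class[labels[i]].append(i)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for members in by_class.values():
        rng.shuffle(members)
        # Deal the shuffled members of this class round-robin over the folds.
        for pos, i in enumerate(members):
            folds[pos % k].append(i)
    return folds


def chi2_quantile_approx(alpha, dof):
    """Wilson-Hilferty approximation to the chi-square alpha-quantile (assumption)."""
    z = NormalDist().inv_cdf(alpha)
    return dof * (1 - 2 / (9 * dof) + z * (2 / (9 * dof)) ** 0.5) ** 3


def radius(n_c, d, alpha=0.5):
    """Radius rho_c = chi2_alpha(d(d+3)/2) / n_c for a class with n_c samples."""
    dof = d * (d + 3) / 2
    return chi2_quantile_approx(alpha, dof) / n_c


# Toy data: 100 samples, two classes (illustrative only, not from the paper).
labels = [0] * 60 + [1] * 40
train, test = train_test_split(labels)      # 75% / 25% random split
folds = stratified_folds(train, labels)     # 5 stratified folds on the training part
rho = radius(n_c=100, d=2)                  # alpha = 0.5, 5 degrees of freedom
```

In this sketch the folds are built only from the training indices, so the held-out 25% plays no part in choosing the ambiguity set size or the regularization parameter.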