Consistent Polyhedral Surrogates for Top-k Classification and Variants

Authors: Anish Thilagar, Rafael Frongillo, Jessica J Finocchiaro, Emma Goodwill

ICML 2022

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. "Finally, we evaluate the performance of our surrogate compared to these previous surrogates (Section 5)." Section 5 (Numerical Comparison): "We have seen that L_k is consistent for top-k classification, while L(2), L(3), and L(4) are not. In general, therefore, we expect these inconsistent losses to have worse top-k performance than L_k. We now quantify this gap for the case n = 5 and k = 3, by computing the expected difference in top-k loss obtained as a result of optimizing each of the four surrogates."
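The quoted comparison measures top-k performance with the top-k zero-one loss, which charges nothing when the true label falls among the k highest-scoring classes. A minimal sketch (the function name and example scores are illustrative, not from the paper):

```python
import numpy as np

def top_k_loss(scores, label, k):
    """Top-k zero-one loss: 0 if the true label is among the k
    highest-scoring classes, 1 otherwise."""
    top_k = np.argsort(scores)[-k:]  # indices of the k largest scores
    return 0.0 if label in top_k else 1.0

# Example with n = 5 classes and k = 3, matching the paper's setting.
scores = np.array([0.1, 0.4, 0.2, 0.9, 0.05])
print(top_k_loss(scores, label=3, k=3))  # prints 0.0 (label 3 is ranked first)
```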
Researcher Affiliation: Academia. "University of Colorado Boulder, Department of Computer Science, Boulder, CO, USA."
Pseudocode: No. No section or figure labeled 'Pseudocode' or 'Algorithm' was found, nor were any structured code-like blocks present.
Open Source Code: No. The paper states 'We also thank Forest Yang and Sanmi Koyejo for providing implementations of previously studied surrogates,' but does not indicate that the authors' own source code for the described methodology is publicly available.
Open Datasets: No. "For each value of α, we sample 10000 conditional label distributions p_i ∼ Dirichlet(αp); we take the feature vector x_i = p_i and draw the label y_i ∼ p_i."
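The quoted synthetic-data procedure can be sketched directly; the helper name, the random seed, and the chosen α and base distribution p are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed

def make_dataset(n_samples, alpha, p):
    """Sample conditional label distributions p_i ~ Dirichlet(alpha * p),
    take features x_i = p_i, and draw labels y_i ~ p_i, following the
    quoted description."""
    P = rng.dirichlet(alpha * p, size=n_samples)  # shape (n_samples, n)
    X = P.copy()                                  # feature vector x_i = p_i
    y = np.array([rng.choice(len(p), p=pi) for pi in P])
    return X, y

# e.g. n = 5 classes with a uniform base distribution
X, y = make_dataset(10000, alpha=1.0, p=np.ones(5) / 5)
```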
Dataset Splits: No. The paper describes generating a training set of 10000 samples and a test set of 1000 samples, but does not explicitly mention a separate validation set.
Hardware Specification: No. No specific hardware details, such as GPU or CPU models, or cloud computing instance types, were mentioned for running the experiments.
Software Dependencies: No. The paper mentions 'Adam' as the optimizer but does not provide version numbers for any software libraries, frameworks, or specific tools used.
Experiment Setup: Yes. "For each dataset and each surrogate loss function, we train a linear model for 200 epochs using Adam with a learning rate of 0.01."
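The quoted setup (linear model, 200 epochs, Adam, learning rate 0.01) can be sketched as follows. Softmax cross-entropy stands in for the paper's surrogate losses, and the Adam moment constants are the usual defaults rather than values stated in the paper:

```python
import numpy as np

def train_linear(X, y, n_classes, epochs=200, lr=0.01):
    """Train a linear model for 200 epochs with full-batch Adam at
    learning rate 0.01, matching the quoted setup. The loss here is
    softmax cross-entropy, a stand-in for the paper's surrogates."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    m = v = np.zeros_like(W)          # Adam first/second moment estimates
    b1, b2, eps = 0.9, 0.999, 1e-8    # standard Adam defaults (assumed)
    Y = np.eye(n_classes)[y]          # one-hot labels
    for t in range(1, epochs + 1):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = X.T @ (probs - Y) / n  # gradient of mean cross-entropy
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad**2
        W -= lr * (m / (1 - b1**t)) / (np.sqrt(v / (1 - b2**t)) + eps)
    return W
```

Swapping in a polyhedral surrogate would only change the loss and its gradient; the optimizer loop stays the same.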