Consistent Polyhedral Surrogates for Top-k Classification and Variants

Authors: Anish Thilagar, Rafael Frongillo, Jessica J Finocchiaro, Emma Goodwill

ICML 2022

Reproducibility variables, results, and LLM responses:
Research Type: Experimental. "Finally, we evaluate the performance of our surrogate compared to these previous surrogates (Section 5)." Section 5 (Numerical Comparison): "We have seen that L_k is consistent for top-k classification, while L(2), L(3), and L(4) are not. In general, therefore, we expect these inconsistent losses to have worse top-k performance than L_k. We now quantify this gap for the case n = 5 and k = 3, by computing the expected difference in top-k loss obtained as a result of optimizing each of the four surrogates."
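The quoted comparison measures top-k performance with the top-k zero-one loss, which charges nothing when the true label falls among the k highest-scoring classes. A minimal sketch (the function name and example scores are illustrative, not from the paper):

```python
import numpy as np

def top_k_loss(scores, label, k):
    """Top-k zero-one loss: 0 if the true label is among the k
    highest-scoring classes, 1 otherwise."""
    top_k = np.argsort(scores)[-k:]  # indices of the k largest scores
    return 0.0 if label in top_k else 1.0

# Example with n = 5 classes and k = 3, matching the paper's setting.
scores = np.array([0.1, 0.4, 0.2, 0.9, 0.05])
print(top_k_loss(scores, label=3, k=3))  # prints 0.0 (label 3 is ranked first)
```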
Researcher Affiliation: Academia. "University of Colorado Boulder, Department of Computer Science, Boulder, CO, USA."
Pseudocode: No. No section or figure labeled 'Pseudocode' or 'Algorithm' was found, nor were any structured code-like blocks present.
Open Source Code: No. The paper states 'We also thank Forest Yang and Sanmi Koyejo for providing implementations of previously studied surrogates,' but does not indicate that the authors' own source code for the described methodology is publicly available.
Open Datasets: No. "For each value of α, we sample 10000 conditional label distributions p_i ∼ Dirichlet(αp); we take the feature vector x_i = p_i and draw the label y_i ∼ p_i."
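The quoted synthetic-data procedure can be sketched directly; the helper name, the random seed, and the chosen α and base distribution p are illustrative assumptions, not values taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed

def make_dataset(n_samples, alpha, p):
    """Sample conditional label distributions p_i ~ Dirichlet(alpha * p),
    take features x_i = p_i, and draw labels y_i ~ p_i, following the
    quoted description."""
    P = rng.dirichlet(alpha * p, size=n_samples)  # shape (n_samples, n)
    X = P.copy()                                  # feature vector x_i = p_i
    y = np.array([rng.choice(len(p), p=pi) for pi in P])
    return X, y

# e.g. n = 5 classes with a uniform base distribution
X, y = make_dataset(10000, alpha=1.0, p=np.ones(5) / 5)
```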
Dataset Splits: No. The paper describes generating a training set of 10000 samples and a test set of 1000 samples, but does not explicitly mention a separate validation set.
Hardware Specification: No. No specific hardware details, such as GPU or CPU models, or cloud computing instance types, were mentioned for running the experiments.
Software Dependencies: No. The paper mentions 'Adam' as the optimizer but does not provide version numbers for any software libraries, frameworks, or specific tools used.
Experiment Setup: Yes. "For each dataset and each surrogate loss function, we train a linear model for 200 epochs using Adam with a learning rate of 0.01."
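The quoted setup (linear model, 200 epochs, Adam, learning rate 0.01) can be sketched as follows. Softmax cross-entropy stands in for the paper's surrogate losses, and the Adam moment constants are the usual defaults rather than values stated in the paper:

```python
import numpy as np

def train_linear(X, y, n_classes, epochs=200, lr=0.01):
    """Train a linear model for 200 epochs with full-batch Adam at
    learning rate 0.01, matching the quoted setup. The loss here is
    softmax cross-entropy, a stand-in for the paper's surrogates."""
    n, d = X.shape
    W = np.zeros((d, n_classes))
    m = v = np.zeros_like(W)          # Adam first/second moment estimates
    b1, b2, eps = 0.9, 0.999, 1e-8    # standard Adam defaults (assumed)
    Y = np.eye(n_classes)[y]          # one-hot labels
    for t in range(1, epochs + 1):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = X.T @ (probs - Y) / n  # gradient of mean cross-entropy
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad**2
        W -= lr * (m / (1 - b1**t)) / (np.sqrt(v / (1 - b2**t)) + eps)
    return W
```

Swapping in a polyhedral surrogate would only change the loss and its gradient; the optimizer loop stays the same.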