Post-hoc estimators for learning to defer to an expert

Authors: Harikrishna Narasimhan, Wittawat Jitkrittum, Aditya Krishna Menon, Ankit Singh Rawat, Sanjiv Kumar

NeurIPS 2022

Reproducibility assessment. Each variable below is listed with the assessed result, followed by the LLM response cited as evidence.

Research Type: Experimental
LLM Response: 5 Experimental Results. We now present empirical results illustrating the efficacy of both our proposed post-hoc estimators.

Researcher Affiliation: Industry
LLM Response: Harikrishna Narasimhan, Google Research, Mountain View (hnarasimhan@google.com); Wittawat Jitkrittum, Google Research, New York (wittawat@google.com); Aditya Krishna Menon, Google Research, New York (adityakmenon@google.com); Ankit Singh Rawat, Google Research, New York (ankitsrawat@google.com); Sanjiv Kumar, Google Research, New York (sanjivk@google.com)

Pseudocode: No
LLM Response: The paper includes figures (Figure 2, Figure 3) that are flowcharts summarizing procedures, but it does not contain pseudocode or algorithm blocks.

Open Source Code: No
LLM Response: 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No]

Open Datasets: Yes
LLM Response: We now report results on the CIFAR-10, CIFAR-100 [16], and ImageNet [10] datasets.

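Since the paper releases no code, the authors' actual data pipeline is unknown; the following is a minimal loading sketch assuming a torchvision pipeline, shown only to indicate how the standard CIFAR-100 train/test splits are typically obtained.

```python
# Minimal data-loading sketch (assumption: a torchvision pipeline; the
# paper releases no code, so the authors' actual setup is unknown).
import torchvision
import torchvision.transforms as T

to_tensor = T.ToTensor()  # converts PIL images to [0, 1] float tensors
cifar100_train = torchvision.datasets.CIFAR100(
    root="./data", train=True, download=True, transform=to_tensor)
cifar100_test = torchvision.datasets.CIFAR100(
    root="./data", train=False, download=True, transform=to_tensor)
print(len(cifar100_train), len(cifar100_test))  # 50000 10000
```
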
Dataset Splits: No
LLM Response: For each c0, we train the base model using the CSS loss (4), and report the resulting accuracy. For c0 > 0, the base model exhibits underfitting, evidenced by significant degradation in the training accuracy. In §3.1, we trace this behaviour to the loss applying a high level of label smoothing [31] to incorrect labels. Consequently, the entropy of the base model probabilities steadily increases with c0 (right panel).

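The quoted passage diagnoses underfitting through the predictive entropy of the base model growing with c0. As an illustration of that diagnostic only, here is a short sketch of the mean-entropy computation; the function name and usage are ours, not the authors' code.

```python
# Diagnostic sketch: mean Shannon entropy of the base model's softmax
# probabilities, the quantity the quoted passage reports rising with c0.
import torch
import torch.nn.functional as F

def mean_prediction_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Average entropy (in nats) of the row-wise softmax distributions."""
    log_probs = F.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1).mean()

# An underfit model outputs near-uniform probabilities, so its entropy
# approaches log(K); for CIFAR-100, log(100) ~= 4.605.
uniform_logits = torch.zeros(4, 100)
print(mean_prediction_entropy(uniform_logits))  # tensor(4.6052)
```
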
Hardware Specification: No
LLM Response: 3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The newly proposed methods in this paper involve simple post-hoc operations over existing models. Thus, they do not add significant computational overhead.

Software Dependencies: No
LLM Response: The paper does not list specific software dependencies with version numbers.

Experiment Setup: Yes
LLM Response: On CIFAR-100, we consider a learning-to-defer setting comprising a ResNet-8 base model and a ResNet-32 expert h_exp. We assume a cost c_exp(x, y) = c0 + 1(y ≠ h_exp(x)) of deferring to the expert (see §2 for details on notation), where the fixed cost c0 is varied over [0, 1]. For each c0, we train the base model using the CSS loss (4), and report the resulting accuracy.

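To make the quoted cost concrete: deferring charges a fixed fee c0 plus a unit penalty whenever the expert errs, so the induced plug-in decision defers exactly when c0 + P(expert wrong) falls below the base model's expected 0-1 error. A minimal sketch under that reading follows; the helper names and probability estimates are ours (in the paper, such estimates come from the proposed post-hoc estimators).

```python
# Sketch of the quoted deferral cost and the plug-in decision it induces.
def deferral_cost(y: int, expert_pred: int, c0: float) -> float:
    """c_exp(x, y) = c0 + 1(y != h_exp(x)): a fixed fee c0 plus a unit
    cost whenever the expert mislabels the example."""
    return c0 + float(y != expert_pred)

def should_defer(p_base_correct: float, p_expert_error: float,
                 c0: float) -> bool:
    """Defer iff the expected expert cost, c0 + P(expert wrong),
    undercuts the base model's expected 0-1 error, 1 - P(base correct)."""
    return c0 + p_expert_error < 1.0 - p_base_correct

# Example: base model 60% likely correct, expert wrong 10% of the time,
# fixed cost c0 = 0.2 -> expected expert cost 0.3 < 0.4, so defer.
print(should_defer(p_base_correct=0.6, p_expert_error=0.1, c0=0.2))  # True
```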