Post-hoc estimators for learning to defer to an expert
Authors: Harikrishna Narasimhan, Wittawat Jitkrittum, Aditya K. Menon, Ankit Rawat, Sanjiv Kumar
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5 (Experimental Results): We now present empirical results illustrating the efficacy of both our proposed post-hoc estimators. |
| Researcher Affiliation | Industry | Harikrishna Narasimhan Google Research, Mountain View hnarasimhan@google.com Wittawat Jitkrittum Google Research, New York wittawat@google.com Aditya Krishna Menon Google Research, New York adityakmenon@google.com Ankit Singh Rawat Google Research, New York ankitsrawat@google.com Sanjiv Kumar Google Research, New York sanjivk@google.com |
| Pseudocode | No | The paper includes figures (Figure 2, Figure 3) that are flowcharts summarizing procedures but do not contain pseudocode or algorithm blocks. |
| Open Source Code | No | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | Yes | We now report results on the CIFAR-10, CIFAR-100 [16], and ImageNet [10] datasets. |
| Dataset Splits | No | For each c_0, we train the base model using the CSS loss (4), and report the resulting accuracy. For c_0 > 0, the base model exhibits underfitting, evidenced by significant degradation in the training accuracy. In Section 3.1, we trace this behaviour to the loss applying a high level of label smoothing [31] to incorrect labels. Consequently, the entropy of the base model probabilities steadily increases with c_0 (right panel). |
| Hardware Specification | No | 3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The newly proposed methods in this paper involve simple post-hoc operations over existing models. Thus, they do not add significant computational overhead. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | On CIFAR-100, we consider a learning-to-defer setting comprising a ResNet-8 base model and a ResNet-32 expert h_exp. We assume a cost c_exp(x, y) = c_0 + 1(y ≠ h_exp(x)) of deferring to the expert (see Section 2 for details on notation), where the fixed cost c_0 is varied over [0, 1]. For each c_0, we train the base model using the CSS loss (4), and report the resulting accuracy. (An illustrative sketch of this deferral cost appears after the table.) |
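The Experiment Setup row quotes the deferral cost c_exp(x, y) = c_0 + 1(y ≠ h_exp(x)): deferring always pays a fixed cost c_0, plus 1 whenever the expert misclassifies. The sketch below is a minimal, hypothetical illustration of this cost and of the standard Chow-style deferral rule it induces, assuming an estimate of the expert's per-example correctness probability is available. It is not the paper's post-hoc estimator or training code; all function names and numbers are made up for illustration.

```python
import numpy as np

def deferral_cost(y_true, expert_pred, c0):
    """Cost of deferring examples with labels y_true to an expert predicting expert_pred:
    c_exp(x, y) = c0 + 1(y != h_exp(x))."""
    return c0 + (np.asarray(y_true) != np.asarray(expert_pred)).astype(float)

def defer_decision(base_probs, expert_correct_prob, c0):
    """Chow-style deferral rule (illustrative only): defer when the expected cost of
    deferring, c0 + (1 - P[expert correct | x]), is below the base model's expected
    0-1 error, 1 - max_y p(y | x)."""
    base_error = 1.0 - base_probs.max(axis=-1)      # expected cost of predicting
    defer_cost = c0 + (1.0 - expert_correct_prob)   # expected cost of deferring
    return defer_cost < base_error                  # True => route example to the expert

# Toy usage with made-up probabilities:
probs = np.array([[0.90, 0.05, 0.05],   # confident base model -> predict
                  [0.40, 0.35, 0.25]])  # uncertain base model -> defer
print(defer_decision(probs, expert_correct_prob=np.array([0.8, 0.8]), c0=0.1))
# [False  True]
```

In this sketch the expert-correctness probability is taken as given; the paper's contribution is precisely to estimate such quantities post hoc from an existing base model and expert, so the decision rule above should be read only as context for the quoted cost definition.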