Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Post-hoc estimators for learning to defer to an expert
Authors: Harikrishna Narasimhan, Wittawat Jitkrittum, Aditya K. Menon, Ankit Rawat, Sanjiv Kumar
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Experimental Results: "We now present empirical results illustrating the efficacy of both our proposed post-hoc estimators." |
| Researcher Affiliation | Industry | Harikrishna Narasimhan, Google Research, Mountain View; Wittawat Jitkrittum, Google Research, New York; Aditya Krishna Menon, Google Research, New York; Ankit Singh Rawat, Google Research, New York; Sanjiv Kumar, Google Research, New York |
| Pseudocode | No | The paper includes figures (Figure 2, Figure 3) that are flowcharts summarizing procedures but do not contain pseudocode or algorithm blocks. |
| Open Source Code | No | 3. If you ran experiments... (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No] |
| Open Datasets | Yes | We now report results on the CIFAR-10, CIFAR-100 [16], and ImageNet [10] datasets. |
| Dataset Splits | No | For each c0, we train the base model using the CSS loss (4), and report the resulting accuracy. For c0 > 0, the base model exhibits underfitting, evidenced by significant degradation in the training accuracy. In Section 3.1, we trace this behaviour to the loss applying a high level of label smoothing [31] to incorrect labels. Consequently, the entropy of the base model probabilities steadily increases with c0 (right panel). |
| Hardware Specification | No | 3. If you ran experiments... (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [No] The newly proposed methods in this paper involve simple post-hoc operations over existing models. Thus, they do not add significant computational overhead. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | On CIFAR-100, we consider a learning-to-defer setting comprising a ResNet-8 base model and a ResNet-32 expert h_exp. We assume a cost c_exp(x, y) = c0 + 1(y ≠ h_exp(x)) of deferring to the expert (see Section 2 for details on notation), where the fixed cost c0 is varied over [0, 1]. For each c0, we train the base model using the CSS loss (4), and report the resulting accuracy. |
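The deferral cost quoted in the Experiment Setup row, c_exp(x, y) = c0 + 1(y ≠ h_exp(x)), can be sketched directly. This is an illustrative snippet only, not code from the paper; the function name `deferral_cost` and its arguments are assumptions for the example.

```python
def deferral_cost(expert_prediction: int, true_label: int, c0: float) -> float:
    """Cost of deferring to the expert: a fixed cost c0, plus 1 if the
    expert's prediction disagrees with the true label (the 0/1 indicator
    term 1(y != h_exp(x)) from the paper's cost definition)."""
    return c0 + (1.0 if expert_prediction != true_label else 0.0)
```

For example, with c0 = 0.2, deferring when the expert is correct costs 0.2, while deferring when the expert errs costs 1.2; varying c0 over [0, 1] trades off how readily the base model should defer.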