Realizable $H$-Consistent and Bayes-Consistent Loss Functions for Learning to Defer
Authors: Anqi Mao, Mehryar Mohri, Yutao Zhong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we empirically evaluate our proposed surrogate losses and compare them with existing baselines. |
| Researcher Affiliation | Collaboration | Anqi Mao, Courant Institute, New York, NY 10012, aqmao@cims.nyu.edu; Mehryar Mohri, Google Research & CIMS, New York, NY 10011, mohri@google.com; Yutao Zhong, Courant Institute, New York, NY 10012, yutao@cims.nyu.edu |
| Pseudocode | No | The paper does not contain pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link indicating that the source code for the methodology described is open-source or publicly available. |
| Open Datasets | Yes | We follow the setting of Mozannar et al. [2023] and conduct experiments on a synthetic dataset: Mixture-of-Gaussians [Mozannar et al., 2023], and three real-world datasets: CIFAR-10H [Battleday et al., 2020], Hate Speech [Davidson et al., 2017], and COMPASS [Dressel and Farid, 2018]. |
| Dataset Splits | Yes | Each dataset is randomly split into 70%, 10%, and 20% for training, validation, and testing, respectively. |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU model, CPU type) used for running the experiments. |
| Software Dependencies | No | The paper mentions using specific software packages or frameworks, but it does not provide specific version numbers for these dependencies (e.g., 'PyTorch 1.9' or 'Python 3.8'). |
| Experiment Setup | No | The paper states 'We use the same optimizer, learning rate, and number of epochs as chosen in [Mozannar et al., 2023]', which refers to an external source for hyperparameters rather than listing them explicitly in the main text of this paper. |
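
The Dataset Splits row reports a random 70%/10%/20% partition into training, validation, and test sets. A minimal sketch of such a split is shown below. The function name `split_indices`, the fixed `seed`, and the rounding of split sizes are assumptions made for illustration; the paper does not state a seed or the exact splitting procedure.

```python
import random


def split_indices(n, fractions=(0.7, 0.1, 0.2), seed=0):
    """Randomly partition range(n) into train/validation/test index lists.

    Hypothetical helper: the seed and the rounding rule are assumptions,
    not details taken from the paper.
    """
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n))
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    n_train = round(fractions[0] * n)
    n_val = round(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    # The test set takes whatever remains, so every index is used exactly once.
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(1000)
print(len(train_idx), len(val_idx), len(test_idx))
```

Assigning the remainder to the test set keeps the three parts disjoint and exhaustive even when `n` is not divisible by the split fractions.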