Counterfactually Comparing Abstaining Classifiers
Authors: Yo Joong Choe, Aditya Gangrade, Aaditya Ramdas
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach is examined in both simulated and real data experiments. ... We present our results in Table 2. ... To illustrate a real data use case, we compare abstaining classifiers on the CIFAR-100 image classification dataset (Krizhevsky, 2009). |
| Researcher Affiliation | Academia | Yo Joong Choe, Data Science Institute, University of Chicago, yjchoe@uchicago.edu; Aditya Gangrade, Department of EECS, University of Michigan, aditg@umich.edu; Aaditya Ramdas, Dept. of Statistics and Data Science & Machine Learning Department, Carnegie Mellon University, aramdas@cmu.edu |
| Pseudocode | No | The paper describes the methods textually and mathematically but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | All code for the experiments is publicly available online at https://github.com/yjchoe/ComparingAbstainingClassifiers. |
| Open Datasets | Yes | To illustrate a real data use case, we compare abstaining classifiers on the CIFAR-100 image classification dataset (Krizhevsky, 2009). |
| Dataset Splits | No | The paper mentions using a 'validation set' for CIFAR-100 but does not provide specific split percentages or sample counts for training, validation, or test sets in the main text. |
| Hardware Specification | No | The paper mentions using 'XSEDE' and the 'Bridges-2 system' at the 'Pittsburgh Supercomputing Center (PSC)', but it does not specify any particular hardware components like CPU or GPU models, or their specifications. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | For the nuisance functions, we try linear predictors (L2-regularized linear/logistic regression for $\hat\mu^0$/$\hat\pi$), random forests, and super learners with k-NN, kernel SVM, and random forests. ... use the same softmax output layer but use a different threshold for abstentions. Specifically, both classifiers use the softmax response (SR) thresholding (Geifman and El-Yaniv, 2017), i.e., abstain if $\max_{c \in \mathcal{Y}} f(X)_c < \tau$ for a threshold $\tau > 0$, but A uses a more conservative threshold ($\tau = 0.8$) than B ($\tau = 0.5$). (Illustrative sketches of the nuisance-function setup and of SR thresholding follow this table.) |
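
The nuisance-function choice quoted in the last row lends itself to a short illustration. Below is a minimal sketch, on synthetic data, of the L2-regularized linear/logistic option for $\hat\mu^0$/$\hat\pi$, followed by a textbook doubly robust (AIPW) combination of the two; the variable names (`X`, `R`, `S`), the convention that `R = 1` means the classifier made a prediction, and the estimator line are our assumptions for illustration, not the authors' code.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, Ridge

rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.normal(size=(n, d))                      # features
R = rng.binomial(1, 0.7, size=n)                 # R = 1: prediction made (no abstention)
S = X @ rng.normal(size=d) + rng.normal(size=n)  # score; in practice observed only when R = 1

# Nuisance functions (one of the paper's listed choices): L2-regularized
# linear regression for the outcome model mu0_hat, and L2-regularized
# logistic regression for the propensity model pi_hat.
mu0_hat = Ridge(alpha=1.0).fit(X[R == 1], S[R == 1])
pi_hat = LogisticRegression(penalty="l2", C=1.0).fit(X, R)

# Doubly robust (AIPW) estimate of the mean score the classifier would
# have received had it never abstained (standard form, hypothetical here).
pi_x = pi_hat.predict_proba(X)[:, 1]
mu_x = mu0_hat.predict(X)
psi_hat = np.mean(R * (S - mu_x) / pi_x + mu_x)
print(f"DR estimate of the counterfactual mean score: {psi_hat:.3f}")
```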
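
The softmax-response (SR) abstention rule from the same row is also easy to make concrete. Here is a small sketch of SR thresholding with the two thresholds quoted above ($\tau = 0.8$ for the conservative classifier A, $\tau = 0.5$ for B); `sr_predict` and the use of `-1` as the abstention code are our conventions.

```python
import numpy as np

def sr_predict(probs: np.ndarray, tau: float) -> np.ndarray:
    """Softmax-response (SR) thresholding (Geifman and El-Yaniv, 2017):
    predict the argmax class, abstain (coded -1) if max_c f(X)_c < tau."""
    top = probs.max(axis=1)
    preds = probs.argmax(axis=1)
    return np.where(top >= tau, preds, -1)

# Same softmax outputs, two thresholds: A abstains more often than B.
rng = np.random.default_rng(0)
logits = rng.normal(size=(5, 100))  # e.g., 5 inputs, 100 CIFAR-100 classes
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

preds_A = sr_predict(probs, tau=0.8)  # conservative classifier A
preds_B = sr_predict(probs, tau=0.5)  # less conservative classifier B
```

Because A's threshold is strictly higher, A abstains on a superset of the points where B abstains, so the two classifiers' scores on their respective predicted points are not directly comparable; this missing-data structure is what the paper's counterfactual comparison addresses.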