Class Probability Matching with Calibrated Networks for Label Shift Adaption
Authors: Hongwei Wen, Annika Betken, Hanyuan Hang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | From the experimental perspective, real data comparisons show that CPMCN outperforms existing matching-based and EM-based algorithms. |
| Researcher Affiliation | Academia | Faculty of Electrical Engineering, Mathematics and Computer Science, University of Twente, Enschede, The Netherlands |
| Pseudocode | Yes | Algorithm 1 Class Probability Matching with Calibrated Networks (CPMCN) |
| Open Source Code | Yes | The training code and parameters can be found in the obtaining predictions folder at https://github.com/kundajelab/labelshiftexperiments/tree/master/notebooks/obtaining_predictions. The calibrated method Bias-Corrected Temperature Scaling (BCTS) is implemented based on the code in https://github.com/kundajelab/abstention/blob/master/abstention/calibration.py. |
| Open Datasets | Yes | In this section, we present experimental results on three benchmark datasets (MNIST, LeCun et al. (2010), CIFAR10, Krizhevsky et al. (2009), and CIFAR100, Krizhevsky et al. (2009)) |
| Dataset Splits | Yes | We take the training set of the benchmark datasets as the data for the source domain Dp and reserve 10000 samples out of Dp as a hold-out validation set, which is used to tune the hyper-parameters of the calibrated BCTS model (Alexandari et al., 2020). |
| Hardware Specification | No | The paper does not specify the exact hardware used for running the experiments (e.g., specific GPU/CPU models, cloud instances with specs). |
| Software Dependencies | No | The paper mentions 'Python' and 'L-BFGS-B (Zhu et al., 1997)' as an optimization method, but does not provide specific version numbers for Python or other key software libraries/frameworks (e.g., PyTorch, TensorFlow). |
| Experiment Setup | Yes | For repeating experiments of each method, we use the code from Geifman & El-Yaniv (2017) to train ten different network models with different random seeds. For each model, we perform 10 trials, where each trial consists of a different sampling of the validation set and a different sampling of the label-shifted target domain data. The total number of repetitions is 100. |
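The repetition protocol in the last row (ten networks trained with different seeds, each evaluated over ten trials with resampled validation and label-shifted target data, for 100 repetitions in total) can be sketched as follows. This is a minimal illustration of the experimental loop, not the authors' code: `run_trial` is a hypothetical placeholder standing in for training-free evaluation of one (model, trial) pair.

```python
import random

def run_trial(model_seed: int, trial_seed: int) -> float:
    # Placeholder (assumption): in the actual experiments this would sample
    # a hold-out validation set, sample label-shifted target data, apply the
    # calibrated network with CPMCN, and return an estimation error.
    rng = random.Random(model_seed * 1000 + trial_seed)
    return rng.random()  # stands in for the per-repetition error

def repeat_experiments(n_models: int = 10, n_trials: int = 10) -> list[float]:
    # 10 models x 10 trials = 100 repetitions, as described in the paper.
    return [run_trial(m, t)
            for m in range(n_models)
            for t in range(n_trials)]

errors = repeat_experiments()
print(len(errors))  # 100 repetitions in total
```

Reported results would then typically be the mean and standard deviation over these 100 repetitions.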