Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning
Authors: Jize Zhang, Bhavya Kailkhura, T. Yong-Jin Han
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks in most of the experimental settings. Our codes are available at https://github.com/zhang64llnl/Mix-n-Match-Calibration. |
| Researcher Affiliation | Academia | Jize Zhang¹, Bhavya Kailkhura¹, T. Yong-Jin Han¹. ¹Lawrence Livermore National Laboratories, Livermore, CA 994550. Correspondence to: Jize Zhang <zhang64@llnl.gov>. |
| Pseudocode | No | The paper describes procedural steps for algorithms (e.g., IRM in Section 3.3.2) but does not present them in a structured pseudocode block or algorithm figure. |
| Open Source Code | Yes | Our codes are available at https://github.com/zhang64llnl/Mix-n-Match-Calibration. |
| Open Datasets | Yes | We calibrate various deep neural network classifiers on popular computer vision datasets: CIFAR-10/100 (Krizhevsky, 2009) with 10/100 classes and ImageNet (Deng et al., 2009) with 1000 classes. |
| Dataset Splits | Yes | We use 45000 images for training and hold out 15000 images for calibration and evaluation. For ImageNet, we acquired 4 pretrained models from (Paszke et al., 2019) which were trained with 1.3 million images, and 50000 images are hold out for calibration and evaluation. We randomly split the hold-out dataset into nc = 5000, ne = 10000 for CIFAR-10/100 and nc = ne = 25000 for ImageNet. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. While it references the PyTorch paper, it does not state the PyTorch version or any other software versions used. |
| Experiment Setup | No | The paper states that "The training detail is described in Sec. S6.", indicating that specific experimental setup details such as hyperparameters are not present in the main text. |
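
The hold-out split quoted in the Dataset Splits row can be illustrated with a minimal sketch. It is not taken from the authors' repository: the function name `split_holdout`, the fixed seed, and the use of index lists are assumptions made for illustration, and the paper does not state which random seed or shuffling routine was used.

```python
import random


def split_holdout(n_holdout, n_calibration, seed=0):
    """Randomly partition hold-out indices into calibration and evaluation sets.

    Hypothetical helper: the seed and shuffling method are assumptions,
    not details reported in the paper.
    """
    if not 0 <= n_calibration <= n_holdout:
        raise ValueError("n_calibration must lie between 0 and n_holdout")
    indices = list(range(n_holdout))
    random.Random(seed).shuffle(indices)  # local RNG, so global state is untouched
    calibration = indices[:n_calibration]
    evaluation = indices[n_calibration:]
    return calibration, evaluation


# CIFAR-10/100: 15000 hold-out images, nc = 5000 and ne = 10000
cifar_cal, cifar_eval = split_holdout(15000, 5000)

# ImageNet: 50000 hold-out images, nc = ne = 25000
imagenet_cal, imagenet_eval = split_holdout(50000, 25000)
```

Because the two index lists are disjoint slices of one permutation, no image appears in both the calibration and the evaluation set, which is what the quoted split sizes require.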