reproducibilityindex.ai

Mix-n-Match: Ensemble and Compositional Methods for Uncertainty Calibration in Deep Learning

Authors: Jize Zhang, Bhavya Kailkhura, T. Yong-Jin Han

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our approaches outperform state-of-the-art solutions on both the calibration as well as the evaluation tasks in most of the experimental settings. Our codes are available at https://github.com/zhang64llnl/Mix-n-Match-Calibration.
Researcher Affiliation	Academia	Jize Zhang (1), Bhavya Kailkhura (1), T. Yong-Jin Han (1). (1) Lawrence Livermore National Laboratories, Livermore, CA 94550. Correspondence to: Jize Zhang <zhang64@llnl.gov>.
Pseudocode	No	The paper describes procedural steps for algorithms (e.g., IRM in Section 3.3.2) but does not present them in a structured pseudocode block or algorithm figure.
Open Source Code	Yes	Our codes are available at https://github.com/zhang64llnl/Mix-n-Match-Calibration.
Open Datasets	Yes	We calibrate various deep neural network classifiers on popular computer vision datasets: CIFAR-10/100 (Krizhevsky, 2009) with 10/100 classes and ImageNet (Deng et al., 2009) with 1000 classes.
Dataset Splits	Yes	We use 45000 images for training and hold out 15000 images for calibration and evaluation. For ImageNet, we acquired 4 pretrained models from (Paszke et al., 2019) which were trained with 1.3 million images, and 50000 images are held out for calibration and evaluation. We randomly split the hold-out dataset into n_c = 5000, n_e = 10000 for CIFAR-10/100 and n_c = n_e = 25000 for ImageNet.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers. While it references the PyTorch paper, it does not state the PyTorch version or any other software versions used.
Experiment Setup	No	The paper states that "The training detail is described in Sec. S6.", indicating that specific experimental setup details such as hyperparameters are not present in the main text.
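
The random hold-out split reported under Dataset Splits can be sketched as follows. This is a minimal illustration, not the authors' code: the function name split_holdout, the use of NumPy, and the fixed seed are all assumptions, since the entry above does not say how the split was seeded or implemented. Only the set sizes (5000/10000 for CIFAR-10/100, 25000/25000 for ImageNet) come from the table.

```python
import numpy as np


def split_holdout(n_holdout, n_calib, seed=0):
    """Randomly partition hold-out indices into calibration and evaluation sets.

    Hypothetical helper: the name, signature, and seed are illustrative
    assumptions, not taken from the paper or its repository.
    """
    if not 0 <= n_calib <= n_holdout:
        raise ValueError("n_calib must be between 0 and n_holdout")
    rng = np.random.default_rng(seed)
    # Shuffle all hold-out indices once, then cut the permutation in two.
    perm = rng.permutation(n_holdout)
    return perm[:n_calib], perm[n_calib:]


# CIFAR-10/100: 15000 hold-out images, 5000 for calibration, 10000 for evaluation.
cifar_calib, cifar_eval = split_holdout(15000, 5000)

# ImageNet: 50000 hold-out images, 25000 each for calibration and evaluation.
imagenet_calib, imagenet_eval = split_holdout(50000, 25000)
```

Because both subsets are slices of one permutation, they are disjoint and together cover the full hold-out set, which keeps the evaluation images unseen during calibration.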