Calibration Bottleneck: Over-compressed Representations are Less Calibratable

Authors: Deng-Bao Wang, Min-Ling Zhang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our comparative experiments show the effectiveness of our method, which improves model calibration and also yields competitive predictive performance. We conduct experiments on several image classification datasets, and the results demonstrate that our method improves the calibrated performance without significantly compromising the predictive performance.
Researcher Affiliation | Academia | School of Computer Science and Engineering, Southeast University, Nanjing, China; Key Lab. of Computer Network and Information Integration (Southeast University), MOE, China. Correspondence to: Min-Ling Zhang <zhangml@seu.edu.cn>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Figure 7 is an illustration, not pseudocode.
Open Source Code | Yes | Code is available at https://github.com/dengbaowang/PLP.
Open Datasets | Yes | The experiments in the main text are based on four widely used image classification datasets: SVHN (Netzer et al., 2011), CIFAR-10, CIFAR-100 (Krizhevsky, 2009) and Tiny-ImageNet (Deng et al., 2009).
Dataset Splits | Yes | We split the original training dataset into a training set and a validation set for main training and post-hoc calibration with the following ratios: 68257/5k for SVHN, 45k/5k for CIFAR-10/100 and 90k/10k for Tiny-ImageNet. (A split sketch is given below the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions optimizers like SGD and Adam, but does not provide specific software names with version numbers (e.g., 'PyTorch 1.9' or 'TensorFlow 2.x') for its ancillary software dependencies.
Experiment Setup | Yes | We use SGD as the optimizer with a momentum of 0.9 and a weight decay of 10^-4 unless otherwise specified. We train on SVHN/CIFAR-10/CIFAR-100 for a total of 350 epochs with an initial learning rate of 0.1, divided by a factor of 10 after 150 and 250 epochs respectively. For Tiny-ImageNet, we conduct training based on open-sourced pretrained models. Based on the pretrained models, we train 200 epochs with an initial learning rate of 0.01, divided by a factor of 2 after every 30 epochs. We set the batch size to 128 on SVHN/CIFAR-10/CIFAR-100, and 64 on Tiny-ImageNet. (A configuration sketch is given below the table.)
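
The splits reported in the Dataset Splits row can be reproduced with a standard random split. The sketch below is a minimal PyTorch/torchvision example under our own assumptions (ToTensor-only preprocessing, a fixed seed); the paper does not publish its split indices or transforms, so treat the details as illustrative.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Assumed preprocessing; the paper does not specify its exact transforms.
transform = transforms.ToTensor()

# CIFAR-10: 50,000 training images -> 45k for main training, 5k for post-hoc calibration.
cifar10_full = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
cifar10_train, cifar10_val = random_split(
    cifar10_full, [45_000, 5_000], generator=torch.Generator().manual_seed(0)
)

# SVHN: 73,257 training images -> 68,257 / 5,000, matching the reported ratio.
svhn_full = datasets.SVHN(root="./data", split="train", download=True, transform=transform)
svhn_train, svhn_val = random_split(
    svhn_full, [68_257, 5_000], generator=torch.Generator().manual_seed(0)
)
```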
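
The SVHN/CIFAR schedule in the Experiment Setup row maps onto a standard SGD-plus-MultiStepLR configuration. The paper does not name its framework, so the following is a hypothetical PyTorch sketch; `model` is a placeholder rather than the authors' actual backbone.

```python
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

# Placeholder model; the paper trains standard image classifiers, not this toy module.
model = torch.nn.Linear(3 * 32 * 32, 10)

# SGD with momentum 0.9, weight decay 10^-4, and initial learning rate 0.1.
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

# Learning rate divided by 10 after epochs 150 and 250, over 350 total epochs.
scheduler = MultiStepLR(optimizer, milestones=[150, 250], gamma=0.1)

for epoch in range(350):
    # ... one training epoch with batch size 128 on SVHN/CIFAR-10/CIFAR-100 goes here ...
    scheduler.step()

# For Tiny-ImageNet, the table reports fine-tuning pretrained models for 200 epochs from
# lr 0.01, halving every 30 epochs (e.g., StepLR(optimizer, step_size=30, gamma=0.5))
# with batch size 64.
```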