Beyond In-Domain Scenarios: Robust Density-Aware Calibration
Authors: Christian Tomani, Futa Kai Waseda, Yuesong Shen, Daniel Cremers
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that DAC leads to consistently better calibration across a large number of model architectures, datasets, and metrics. Additionally, we show that DAC improves calibration substantially on recent large-scale neural networks pre-trained on vast amounts of data. |
| Researcher Affiliation | Academia | Technical University of Munich; Munich Center for Machine Learning; The University of Tokyo. |
| Pseudocode | No | The paper describes the proposed method using mathematical equations and textual explanations, but it does not include a formal pseudocode block or algorithm steps. |
| Open Source Code | Yes | Source code available at: https://github.com/futakw/DensityAwareCalibration |
| Open Datasets | Yes | We consider 3 different datasets to evaluate our models on: CIFAR10/100 (Krizhevsky et al., 2009), and ImageNet-1k (Deng et al., 2009). |
| Dataset Splits | Yes | We split each dataset into train, validation, and test set. ... For CIFAR10 and CIFAR100 (Krizhevsky et al., 2009), following Guo et al. (2017), we split the original train set, which contains 50,000 image-label pairs, into 45,000 image-label pairs of the train set and 5,000 image-label pairs of the validation set. For ImageNet (Deng et al., 2009), we split the original validation set, which contains 50,000 image-label pairs, into 12,500 image-label pairs of the validation set and 37,500 image-label pairs of the test set. (Table 5, image-label pairs per dataset: CIFAR10: 45,000 train / 5,000 val / 10,000 test; CIFAR100: 45,000 / 5,000 / 10,000; ImageNet: 1,281,167 / 12,500 / 37,500.) A split sketch is given below the table. |
| Hardware Specification | Yes | We first compare the training speed of DAC, DIA, ETS, and SPL for DenseNet169 trained on ImageNet, using an NVIDIA Titan X (12GB) GPU. |
| Software Dependencies | No | The paper mentions software like 'PyTorch's official implementation', 'Faiss library', and 'SciPy optimization' but does not provide specific version numbers for these, which are required for full reproducibility. |
| Experiment Setup | Yes | We trained all models for 200 epochs: 100 epochs at a learning rate of 0.01, 50 epochs at a learning rate of 0.005, 30 epochs at a learning rate of 0.001, and 20 epochs at a learning rate of 0.0001. We use a basic data augmentation technique of random cropping and horizontal flipping. ... PTS was trained as a neural network with 2 fully connected hidden layers with 5 nodes each, using a learning rate of 0.00005, batch size of 1000, and step size of 100,000. A schedule sketch is given below the table. |
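
The split sizes quoted in the Dataset Splits row can be reproduced with a short PyTorch/torchvision sketch. The use of `random_split` and the fixed seed are assumptions made for illustration; the paper does not state how the split indices were drawn.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# CIFAR10: split the original 50,000-image train set into 45,000 train / 5,000 val,
# following Guo et al. (2017) as quoted above. The seed is an assumption.
cifar_full = datasets.CIFAR10(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())
cifar_train, cifar_val = random_split(cifar_full, [45_000, 5_000],
                                      generator=torch.Generator().manual_seed(0))

# ImageNet: split the original 50,000-image validation set into
# 12,500 validation and 37,500 test image-label pairs.
imagenet_val_full = datasets.ImageNet(root="./imagenet", split="val",
                                      transform=transforms.ToTensor())
imnet_val, imnet_test = random_split(imagenet_val_full, [12_500, 37_500],
                                     generator=torch.Generator().manual_seed(0))
```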
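
Similarly, the training schedule and augmentation quoted in the Experiment Setup row map onto a standard PyTorch loop. The SGD optimizer, the 32x32 crop with padding 4, and the placeholder model are assumptions used only to illustrate the piecewise-constant learning-rate schedule.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Augmentation quoted in the paper: random cropping and horizontal flipping.
# The 32x32 crop with padding 4 is a common CIFAR choice and an assumption here.
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Piecewise-constant learning rate over 200 epochs, as quoted:
# epochs 0-99 -> 0.01, 100-149 -> 0.005, 150-179 -> 0.001, 180-199 -> 0.0001.
def lr_for_epoch(epoch: int) -> float:
    if epoch < 100:
        return 0.01
    if epoch < 150:
        return 0.005
    if epoch < 180:
        return 0.001
    return 0.0001

model = nn.Linear(3 * 32 * 32, 10)   # placeholder; the paper trains CNN architectures
optimizer = torch.optim.SGD(model.parameters(), lr=lr_for_epoch(0))  # SGD is an assumption

for epoch in range(200):
    for group in optimizer.param_groups:
        group["lr"] = lr_for_epoch(epoch)
    # ... one pass over the 45,000-image train split with train_transform goes here ...
```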