Distance-Based Learning from Errors for Confidence Calibration
Authors: Chen Xing, Sercan Arik, Zizhao Zhang, Tomas Pfister
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On multiple datasets and DNN architectures, we demonstrate that DBLE outperforms alternative single-model confidence calibration approaches. DBLE also achieves performance comparable to computationally expensive ensemble approaches, with lower computational cost and fewer parameters. |
| Researcher Affiliation | Collaboration | Chen Xing (College of Computer Science, Nankai University, Tianjin, China); Sercan O. Arık (Google Cloud AI, Sunnyvale, CA); Zizhao Zhang (Google Cloud AI, Sunnyvale, CA); Tomas Pfister (Google Cloud AI, Sunnyvale, CA) |
| Pseudocode | Yes | Algorithm 1: One update of DBLE. |
| Open Source Code | No | The paper does not provide any explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | MLP on MNIST (LeCun et al., 1998), VGG-11 (Simonyan & Zisserman, 2014) on CIFAR-10 (Krizhevsky et al., 2009), ResNet-50 (He et al., 2016) on CIFAR-100 (Krizhevsky et al., 2009), and ResNet-50 on Tiny-ImageNet (Deng et al., 2009). |
| Dataset Splits | No | The paper mentions tuning hyperparameters 'according to classification performance on validate set' but does not specify the exact percentages, sample counts, or methodology for the training/validation/test splits. |
| Hardware Specification | No | The paper does not provide any specific details regarding the hardware (e.g., CPU, GPU models, memory) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions general techniques and components such as 'stochastic gradient descent with momentum', 'ReLU non-linearity', and 'Dropout', but does not specify any software dependencies with version numbers (e.g., Python version, specific deep learning frameworks like PyTorch or TensorFlow, or other libraries with their versions). |
| Experiment Setup | Yes | We use an initial learning rate of 0.1 and a fixed momentum coefficient of 0.9 for all methods tested. The learning rate scheduling is tuned according to classification performance on the validation set. ... We fix the dropout rate as 0.5. ... At inference, we fix the number of representation samplings U in Eq. 10 as 20. ... We set the number of bins L in Eq. 11 as 15 following (Guo et al., 2017). |
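The Experiment Setup row fixes the number of calibration bins at L = 15, following Guo et al. (2017). A minimal sketch of that metric, Expected Calibration Error (ECE), under the standard equal-width binning assumption; the function name and toy data below are illustrative, not from the paper:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=15):
    """ECE (Guo et al., 2017): bin predictions by confidence into n_bins
    equal-width bins, then sum each bin's |accuracy - confidence| gap,
    weighted by the fraction of samples falling in that bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    n = len(confidences)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        # Half-open bins (lo, hi]; a confidence of exactly 0.0 is ignored,
        # which is harmless for softmax-style confidences.
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            acc = correct[in_bin].mean()    # empirical accuracy in the bin
            conf = confidences[in_bin].mean()  # mean confidence in the bin
            ece += (in_bin.sum() / n) * abs(acc - conf)
    return ece

# Toy example: two bins, each with a 0.1 gap between confidence and accuracy.
ece = expected_calibration_error([0.9, 0.9, 0.6, 0.6], [1, 1, 1, 0])
```

With 15 bins the two confidence levels above land in separate bins, so the ECE is the size-weighted sum of the two per-bin gaps (0.1 here).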