Intra Order-preserving Functions for Calibration of Multi-Class Neural Networks
Authors: Amir Rahimi, Amirreza Shaban, Ching-An Cheng, Richard Hartley, Byron Boots
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show the effectiveness of the proposed method across a wide range of datasets and classifiers. Our method outperforms state-of-the-art post-hoc calibration methods, namely temperature scaling and Dirichlet calibration, in several evaluation metrics for the task. ... We evaluate the performance of intra order-preserving (OP), order-invariant intra order-preserving (OI), and diagonal intra order-preserving (DIAG) families in calibrating the output of various image classification deep networks and compare their results with the previous post-hoc calibration techniques. ... Table 1 summarizes the results of our calibration methods and other baselines in terms of ECE and presents the average relative error with respect to the uncalibrated model. |
| Researcher Affiliation | Collaboration | Amir Rahimi (ANU, ACRV) amir.rahimi@anu.edu.au; Amirreza Shaban (Georgia Tech) ashaban@uw.edu; Ching-An Cheng (Microsoft Research) chinganc@microsoft.com; Richard Hartley (Google Research, ANU, ACRV) richard.hartley@anu.edu.au; Byron Boots (University of Washington) bboots@cs.washington.edu |
| Pseudocode | No | The paper includes a flow graph (Figure 3) but does not contain explicit pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper refers to the 'official implementation' of a baseline method [14] and a proposed architecture from another paper [31], but does not provide specific access to its own source code for the intra order-preserving functions. |
| Open Datasets | Yes | We use six different datasets: CIFAR-{10,100} [13], SVHN [24], CARS [12], BIRDS [32], and ImageNet [4]. |
| Dataset Splits | Yes | We follow the experiment protocol in [14, 16] and use cross validation on the calibration dataset to find the best hyperparameters and architectures for all the methods. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running its experiments. |
| Software Dependencies | No | The paper uses deep learning frameworks implicitly but does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | The paper states that 'cross validation on the calibration dataset to find the best hyperparameters and architectures' was used and mentions the negative log likelihood (NLL) loss with a regularization weight λ, but it does not report the chosen values of these hyperparameters or other training settings such as learning rate or batch size. |
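The table above cites ECE as the headline metric (Table 1 of the paper). For readers unfamiliar with it, the standard Expected Calibration Error can be sketched as below; this is a generic numpy sketch, not the authors' code, and the equal-width binning and `n_bins=15` default are assumptions since the paper's exact binning is not quoted here.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=15):
    """Standard ECE: bin predictions by confidence, then average
    |accuracy - confidence| per bin, weighted by bin population.

    probs:  (N, K) predicted class probabilities
    labels: (N,)   ground-truth class indices
    """
    confidences = probs.max(axis=1)
    predictions = probs.argmax(axis=1)
    accuracies = (predictions == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            # weight = fraction of samples in this bin
            ece += mask.mean() * abs(accuracies[mask].mean()
                                     - confidences[mask].mean())
    return ece
```

For example, a model that predicts its top class with confidence 0.9 but is correct 100% of the time has an ECE of 0.1, i.e., it is underconfident by 0.1.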
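The temperature-scaling baseline named in the table fits a single scalar T on a held-out calibration set by minimizing NLL over the rescaled logits. A minimal grid-search sketch follows; the grid range and step are assumptions for illustration, and this is not the paper's implementation (which also involves the regularization weight λ and cross-validation, per the table).

```python
import numpy as np

def softmax(z):
    """Numerically stable row-wise softmax."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels, grid=np.linspace(0.1, 5.0, 200)):
    """Pick the temperature T minimizing NLL of softmax(logits / T)
    on a calibration set, via a simple grid search (an assumed
    optimizer choice; gradient-based fitting is equally common)."""
    rows = np.arange(len(labels))

    def nll(t):
        p = softmax(logits / t)
        return -np.log(p[rows, labels] + 1e-12).mean()

    return min(grid, key=nll)
```

T > 1 softens overconfident predictions and T < 1 sharpens underconfident ones; crucially, dividing logits by a positive scalar never changes the argmax, so accuracy is preserved, which is the same order-preservation property the paper's function families generalize.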