To Trust Or Not To Trust A Classifier
Authors: Heinrich Jiang, Been Kim, Melody Guan, Maya Gupta
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we empirically test whether trust scores can both detect examples that are incorrectly classified with high precision and be used as a signal to determine which examples are likely correctly classified. We perform this evaluation across (i) different datasets (Sections 5.1 and 5.3), (ii) different families of classifiers (neural network, random forest and logistic regression) (Section 5.1), (iii) classifiers with varying accuracy on the same task (Section 5.2) and (iv) different representations of the data e.g. input data or activations of various intermediate layers in neural network (Section 5.3). |
| Researcher Affiliation | Collaboration | Heinrich Jiang Google Research heinrichj@google.com Been Kim Google Brain beenkim@google.com Melody Y. Guan Stanford University mguan@stanford.edu Maya Gupta Google Research mayagupta@google.com |
| Pseudocode | Yes | Algorithm 1 (Estimating the α-high-density set). Parameters: α (density threshold), k. Inputs: sample points X := {x₁, ..., xₙ} drawn from f. Define the k-NN radius rₖ(x) := inf{r > 0 : \|B(x, r) ∩ X\| ≥ k} and let ε := inf{r > 0 : \|{x ∈ X : rₖ(x) > r}\| ≤ α·n}. Return Ĥ_α(f) := {x ∈ X : rₖ(x) ≤ ε}. Algorithm 2 (Trust Score). Parameters: α (density threshold), k. Inputs: classifier h : X → Y, training data (x₁, y₁), ..., (xₙ, yₙ), test example x. For each ℓ ∈ Y, let Ĥ_α(f_ℓ) be the output of Algorithm 1 with parameters α, k on the sample points {xⱼ : 1 ≤ j ≤ n, yⱼ = ℓ}. Return the trust score ξ(h, x) := d(x, Ĥ_α(f_h̃(x))) / d(x, Ĥ_α(f_h(x))), where h̃(x) := argmin_{ℓ ∈ Y, ℓ ≠ h(x)} d(x, Ĥ_α(f_ℓ)). |
| Open Source Code | Yes | An open-source implementation of Trust Scores can be found here: https://github.com/google/TrustScore |
| Open Datasets | Yes | The MNIST handwritten digit dataset [48] consists of 60,000 28×28-pixel training images and 10,000 testing images in 10 classes. The SVHN dataset [49] consists of 73,257 32×32-pixel colour training images and 26,032 testing images and also has 10 classes. The CIFAR-10 and CIFAR-100 datasets [50] both consist of 60,000 32×32-pixel colour images, with 50,000 training images and 10,000 test images. |
| Dataset Splits | Yes | For each run we took a random stratified split of the dataset into two halves. One portion was used for training the trust score and the other was used for evaluation; the standard error is shown in addition to the average precision across the runs at each percentile level. Standard train/test splits were used for each dataset: MNIST [48] (60,000 training / 10,000 testing images), SVHN [49] (73,257 training / 26,032 testing images), and CIFAR-10 and CIFAR-100 [50] (50,000 training / 10,000 test images each). |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, memory, or other computer specifications used for running the experiments. It only mentions using pretrained models and networks implemented in Keras. |
| Software Dependencies | No | The paper mentions Keras [53] as the framework used (François Chollet et al. Keras. https://github.com/fchollet/keras, 2015) but does not specify a version number for Keras or for any other software dependency. |
| Experiment Setup | Yes | Throughout our experiments, we fix k = 10, and use cross-validation to select α as it is data-dependent. ... The CIFAR-10 VGG-16 network achieves a test accuracy of 93.56% while the CIFAR-100 network achieves a test accuracy of 70.48%. We used pretrained, smaller CNNs for MNIST and SVHN. The MNIST network achieves a test accuracy of 99.07% and the SVHN network achieves a test accuracy of 95.45%. All architectures were implemented in Keras [53]. ... As input to the trust score, we tried using 1) the logit layer, 2) the preceding fully connected layer with ReLU activation, 3) this fully connected layer, which has 128 dimensions in the MNIST network and 512 dimensions in the other networks, reduced down to 20 dimensions by applying PCA. ... All plots were made using α = 0; using cross-validation to select a different α did not improve trust score performance. |
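The two algorithms quoted in the Pseudocode row can be sketched in plain NumPy. This is a minimal illustrative sketch, not the released google/TrustScore implementation: the function names are invented here, and the brute-force pairwise distances stand in for a proper nearest-neighbor index.

```python
import numpy as np

def knn_radii(X, k):
    """k-NN radius r_k(x): distance from each sample to its k-th nearest
    sample point (the point itself counts as a neighbor, at distance 0)."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    return np.sort(D, axis=1)[:, k - 1]

def high_density_set(X, alpha=0.0, k=10):
    """Algorithm 1 sketch: drop roughly the alpha fraction of points with
    the largest k-NN radius, keeping the estimated alpha-high-density set."""
    if alpha <= 0.0:
        return X                       # alpha = 0 keeps every point
    r = knn_radii(X, k)
    eps = np.quantile(r, 1.0 - alpha)  # threshold exceeded by ~alpha*n points
    return X[r <= eps]

def trust_score(X_train, y_train, x, predicted_label, alpha=0.0, k=10):
    """Algorithm 2 sketch: ratio of the distance from x to the nearest
    *other* class's high-density set over the distance to the *predicted*
    class's high-density set. Scores above 1 indicate the predicted class
    is the closest high-density class."""
    d = {}
    for label in np.unique(y_train):
        H = high_density_set(X_train[y_train == label], alpha, k)
        d[label] = np.min(np.linalg.norm(H - x, axis=1))
    d_other = min(v for lbl, v in d.items() if lbl != predicted_label)
    return d_other / d[predicted_label]
```

On two well-separated clusters, a test point near the class-0 cluster gets a trust score well above 1 when the classifier predicts class 0, and well below 1 when it predicts class 1, matching the paper's use of the score as a signal for likely-correct versus likely-incorrect predictions.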