To Trust Or Not To Trust A Classifier

Authors: Heinrich Jiang, Been Kim, Melody Guan, Maya Gupta

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically test whether trust scores can both detect examples that are incorrectly classified with high precision and be used as a signal to determine which examples are likely correctly classified. We perform this evaluation across (i) different datasets (Sections 5.1 and 5.3), (ii) different families of classifiers (neural network, random forest, and logistic regression) (Section 5.1), (iii) classifiers with varying accuracy on the same task (Section 5.2), and (iv) different representations of the data, e.g., the input data or the activations of various intermediate layers of a neural network (Section 5.3).
Researcher Affiliation | Collaboration | Heinrich Jiang (Google Research, heinrichj@google.com); Been Kim (Google Brain, beenkim@google.com); Melody Y. Guan (Stanford University, mguan@stanford.edu); Maya Gupta (Google Research, mayagupta@google.com)
Pseudocode | Yes | Algorithm 1 (Estimating the α-high-density set). Parameters: α (density threshold), k. Input: sample points X := {x₁, ..., xₙ} drawn from f. Define the k-NN radius r_k(x) := inf{r > 0 : |B(x, r) ∩ X| ≥ k} and let ε := inf{r > 0 : |{x ∈ X : r_k(x) > r}| ≤ α·n}. Return Ĥ_α(f) := {x ∈ X : r_k(x) ≤ ε}. Algorithm 2 (Trust Score). Parameters: α (density threshold), k. Input: classifier h : X → Y, training data (x₁, y₁), ..., (xₙ, yₙ), test example x. For each ℓ ∈ Y, let Ĥ_α(f_ℓ) be the output of Algorithm 1 with parameters α, k on the sample points {x_j : 1 ≤ j ≤ n, y_j = ℓ}. Return the trust score ξ(h, x) := d(x, Ĥ_α(f_{h̃(x)})) / d(x, Ĥ_α(f_{h(x)})), where h̃(x) := argmin_{ℓ ∈ Y, ℓ ≠ h(x)} d(x, Ĥ_α(f_ℓ)).
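As a concrete reading of the two algorithms, here is a minimal Python sketch, assuming Euclidean distance and scikit-learn's NearestNeighbors; the function names and structure are my own, not the authors' reference implementation.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def high_density_set(X, alpha, k=10):
    """Algorithm 1: keep the (1 - alpha) fraction of sample points with the
    smallest k-NN radius, an estimate of the alpha-high-density set."""
    nn = NearestNeighbors(n_neighbors=k).fit(X)
    # Distance to each point's k-th nearest neighbor (each point is its own
    # 1st neighbor here, a boundary detail a real implementation would
    # handle explicitly).
    radii = nn.kneighbors(X)[0][:, -1]
    eps = np.percentile(radii, 100 * (1 - alpha))  # cuts off ~alpha * n points
    return X[radii <= eps]

def trust_score(x, y_pred, density_sets, tiny=1e-12):
    """Algorithm 2: distance from x to the nearest *other* class's
    high-density set, divided by the distance to the predicted class's."""
    dist = {l: np.linalg.norm(H - x, axis=1).min()
            for l, H in density_sets.items()}
    d_other = min(d for l, d in dist.items() if l != y_pred)
    return d_other / (dist[y_pred] + tiny)  # tiny guards against division by zero
```

The per-class density sets would be built once from the training data, e.g. `density_sets = {l: high_density_set(X_train[y_train == l], alpha) for l in np.unique(y_train)}`; a large trust score then indicates the test point lies much closer to its predicted class's high-density region than to any other class's.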
Open Source Code | Yes | An open-source implementation of Trust Scores can be found here: https://github.com/google/TrustScore
Open Datasets | Yes | The MNIST handwritten digit dataset [48] consists of 60,000 28×28-pixel training images and 10,000 testing images in 10 classes. The SVHN dataset [49] consists of 73,257 32×32-pixel colour training images and 26,032 testing images, and also has 10 classes. The CIFAR-10 and CIFAR-100 datasets [50] both consist of 60,000 32×32-pixel colour images, with 50,000 training images and 10,000 test images.
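For reference, the MNIST and CIFAR splits quoted above match the ones Keras ships; a sketch of loading them follows (using the modern tensorflow.keras namespace rather than the 2015 Keras the paper cites). SVHN is not bundled with Keras, and obtaining it from the canonical source at http://ufldl.stanford.edu/housenumbers/ is my assumption, not something the paper states.

```python
from tensorflow.keras import datasets

(x_tr, y_tr), (x_te, y_te) = datasets.mnist.load_data()     # 60,000 / 10,000, 28x28 grayscale
(x_tr, y_tr), (x_te, y_te) = datasets.cifar10.load_data()   # 50,000 / 10,000, 32x32 colour
(x_tr, y_tr), (x_te, y_te) = datasets.cifar100.load_data()  # 50,000 / 10,000, 32x32 colour
```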
Dataset Splits | Yes | For each run we took a random stratified split of the dataset into two halves: one portion was used for training the trust score and the other for evaluation. The standard error is shown in addition to the average precision across the runs at each percentile level. ... The MNIST handwritten digit dataset [48] consists of 60,000 28×28-pixel training images and 10,000 testing images in 10 classes. The SVHN dataset [49] consists of 73,257 32×32-pixel colour training images and 26,032 testing images, and also has 10 classes. The CIFAR-10 and CIFAR-100 datasets [50] both consist of 60,000 32×32-pixel colour images, with 50,000 training images and 10,000 test images.
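The evaluation protocol described here is a plain stratified half/half split repeated over several runs; a short sketch with scikit-learn (the variable names, loop count, and use of train_test_split are my choices, not the paper's code):

```python
from sklearn.model_selection import train_test_split

# X, y: full dataset features and labels for one task.
for run_seed in range(10):  # number of repeated runs is illustrative
    # Half the data fits the trust score; the other half is scored.
    X_trust, X_eval, y_trust, y_eval = train_test_split(
        X, y, test_size=0.5, stratify=y, random_state=run_seed)
```

Averaging precision over such runs at each percentile level, with standard errors, yields the curves the paper reports.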
Hardware Specification | No | The paper does not provide hardware details such as GPU/CPU models or memory used to run the experiments; it only mentions using pretrained models and networks implemented in Keras.
Software Dependencies | No | The paper mentions Keras [53] as the framework used (François Chollet et al., Keras, https://github.com/fchollet/keras, 2015) but does not give a version number for Keras or any other software dependency.
Experiment Setup | Yes | Throughout our experiments, we fix k = 10, and use cross-validation to select α as it is data-dependent. ... The CIFAR-10 VGG-16 network achieves a test accuracy of 93.56%, while the CIFAR-100 network achieves a test accuracy of 70.48%. We used pretrained, smaller CNNs for MNIST and SVHN. The MNIST network achieves a test accuracy of 99.07% and the SVHN network achieves a test accuracy of 95.45%. All architectures were implemented in Keras [53]. ... As input to the trust score, we tried using 1) the logit layer, 2) the preceding fully connected layer with ReLU activation, 3) this fully connected layer, which has 128 dimensions in the MNIST network and 512 dimensions in the other networks, reduced down to 20 dimensions by applying PCA. ... All plots were made using α = 0; using cross-validation to select a different α did not improve trust score performance.
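Putting the setup together, here is a hedged sketch of one configuration: trust scores over a network's fully connected layer activations reduced to 20 dimensions with PCA (get_activations is a hypothetical helper, and high_density_set / trust_score refer to the sketch after the pseudocode row above, not the authors' code).

```python
import numpy as np
from sklearn.decomposition import PCA

k, alpha = 10, 0.0  # fixed k = 10; alpha = 0 keeps every training point
acts_train = get_activations(model, X_train)  # hypothetical: layer activations
acts_test = get_activations(model, X_test)

pca = PCA(n_components=20).fit(acts_train)    # 128- or 512-dim layer -> 20 dims
Z_train, Z_test = pca.transform(acts_train), pca.transform(acts_test)

density_sets = {l: high_density_set(Z_train[y_train == l], alpha, k)
                for l in np.unique(y_train)}
scores = [trust_score(z, yp, density_sets)
          for z, yp in zip(Z_test, y_pred_test)]
```

With α = 0 the estimated high-density set is simply the full per-class training sample, consistent with the paper's note that cross-validating α did not improve trust score performance.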