Disrupting Deep Uncertainty Estimation Without Harming Accuracy
Authors: Ido Galil, Ran El-Yaniv
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test the proposed attack on several contemporary architectures such as MobileNetV2 and EfficientNet-B0, all trained to classify ImageNet. We evaluate the potency of ACE using several metrics: AURC (×10³), Negative Log-likelihood (NLL) and Brier score [1], commonly used for uncertainty estimation in DNNs and defined in Appendix A. We test several architectures on the ImageNet [5] validation set (which contains 50,000 images), implemented and pretrained by PyTorch [24], with the exception of EfficientNet-B0, which was implemented and pretrained by [29]. (A hedged sketch of these metrics appears after the table.) |
| Researcher Affiliation | Collaboration | Ido Galil Technion idogalil.ig@gmail.com Ran El-Yaniv Technion, Deci.AI rani@cs.technion.ac.il |
| Pseudocode | Yes | Algorithm 1 Attack on Confidence Estimation |
| Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology. A reference to 'PyTorch Image Models' [29] is for a third-party library, not their own implementation code. |
| Open Datasets | Yes | Our attacks are focused on the task of ImageNet classification [5]. We test ACE under white-box settings attacking SelectiveNet using two different architectures for its backbone, namely ResNet18 and VGG16, both trained on CIFAR-10 [16]. |
| Dataset Splits | Yes | We test several architectures on the ImageNet [5] validation set (which contains 50,000 images). The empirical coverage of SelectiveNet with ResNet18 as its backbone on the test set is 0.648, and the selective risk we show in Table 4 is the model's risk for that exact coverage. (A sketch of loading the pretrained models and the validation split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU or CPU models, memory) used for running its experiments. It only mentions general tools like PyTorch for implementation. |
| Software Dependencies | No | The paper mentions 'implemented and pretrained by PyTorch [24]', but it does not specify a version number for PyTorch or any other key software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | In all of our experiments, we set the hyperparameters arbitrarily: ϵ_decay = 0.5, max_iterations = 15. (An illustrative attack-loop sketch using these hyperparameters follows the table.) |
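
The metrics quoted in the Research Type row (AURC, NLL, Brier score) are standard for evaluating uncertainty estimation. The sketch below shows one common way to compute them from softmax outputs; it assumes maximum-softmax-probability confidence for the risk-coverage curve and the ×10³ scaling of AURC used in the paper's tables, and is an illustration rather than the authors' evaluation code.

```python
# Hedged sketch (not the authors' code): NLL, Brier score, and AURC from softmax outputs.
import numpy as np

def nll(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean negative log-likelihood of the true class."""
    eps = 1e-12
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs + eps)))

def brier_score(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared error between the softmax vector and the one-hot label."""
    one_hot = np.zeros_like(probs)
    one_hot[np.arange(len(labels)), labels] = 1.0
    return float(np.mean(np.sum((probs - one_hot) ** 2, axis=1)))

def aurc(probs: np.ndarray, labels: np.ndarray) -> float:
    """Area under the risk-coverage curve, with max softmax as the confidence score."""
    confidence = probs.max(axis=1)
    errors = (probs.argmax(axis=1) != labels).astype(float)
    order = np.argsort(-confidence)                 # most confident first
    cumulative_risk = np.cumsum(errors[order]) / np.arange(1, len(labels) + 1)
    return float(cumulative_risk.mean())            # paper reports AURC x 10^3

# Toy example with random predictions over 1000 ImageNet classes:
rng = np.random.default_rng(0)
logits = rng.normal(size=(32, 1000))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 1000, size=32)
print(nll(probs, labels), brier_score(probs, labels), aurc(probs, labels) * 1e3)
```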
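
The paper states that the evaluated models were implemented and pretrained by PyTorch [24] (torchvision), except EfficientNet-B0, which comes from PyTorch Image Models [29] (timm), and that evaluation uses the 50,000-image ImageNet validation split. A minimal loading sketch under those assumptions, with a placeholder dataset path and standard ImageNet preprocessing, might look like this:

```python
# Hedged sketch of model and data loading; the dataset path and transform details are assumptions.
import timm
import torch
import torchvision
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ImageNet validation set (50,000 images); "/data/imagenet" is a placeholder path.
val_set = torchvision.datasets.ImageNet("/data/imagenet", split="val",
                                        transform=preprocess)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=64, num_workers=8)

models = {
    "mobilenet_v2": torchvision.models.mobilenet_v2(pretrained=True),
    "efficientnet_b0": timm.create_model("efficientnet_b0", pretrained=True),
}
for model in models.values():
    model.eval()
```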
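
The paper's Algorithm 1 (Attack on Confidence Estimation) is not reproduced in this table, so the sketch below is only an illustration of how the stated hyperparameters (ϵ_decay = 0.5, max_iterations = 15) could drive a label-preserving confidence attack: take signed-gradient steps that lower the confidence assigned to a correctly classified input, and shrink the step size by ϵ_decay whenever the predicted class would flip. The initial step size `eps` and the overall loop structure are assumptions, not the authors' procedure.

```python
# Illustrative sketch only -- NOT the paper's Algorithm 1.
import torch
import torch.nn.functional as F

def confidence_attack(model, x, eps=0.005, eps_decay=0.5, max_iterations=15):
    with torch.no_grad():
        original_pred = model(x).argmax(dim=1)
    x_adv = x.clone()
    for _ in range(max_iterations):
        x_try = x_adv.clone().requires_grad_(True)
        probs = F.softmax(model(x_try), dim=1)
        # Confidence assigned to the originally predicted class.
        confidence = probs.gather(1, original_pred.unsqueeze(1)).sum()
        confidence.backward()
        candidate = (x_try - eps * x_try.grad.sign()).detach()
        if (model(candidate).argmax(dim=1) == original_pred).all():
            x_adv = candidate            # labels preserved: accept the step
        else:
            eps = eps * eps_decay        # a label would flip: shrink the step
    return x_adv
```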