Disrupting Deep Uncertainty Estimation Without Harming Accuracy
Authors: Ido Galil, Ran El-Yaniv
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test the proposed attack on several contemporary architectures such as MobileNetV2 and EfficientNet-B0, all trained to classify ImageNet. We evaluate the potency of ACE using several metrics: AURC (×10³), Negative Log-likelihood (NLL) and Brier score [1], commonly used for uncertainty estimation in DNNs and defined in Appendix A. We test several architectures on the ImageNet [5] validation set (which contains 50,000 images), implemented and pretrained by PyTorch [24], with the exception of EfficientNet-B0, which was implemented and pretrained by [29]. (A hedged sketch of these metrics appears after the table.) |
| Researcher Affiliation | Collaboration | Ido Galil Technion idogalil.ig@gmail.com Ran El-Yaniv Technion, Deci.AI rani@cs.technion.ac.il |
| Pseudocode | Yes | Algorithm 1 Attack on Confidence Estimation |
| Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology. A reference to 'PyTorch Image Models' [29] is for a third-party library, not their own implementation code. |
| Open Datasets | Yes | Our attacks are focused on the task of ImageNet classification [5]. We test ACE under white-box settings attacking SelectiveNet using two different architectures for its backbone, namely ResNet18 and VGG16, both trained on CIFAR-10 [16]. |
| Dataset Splits | Yes | We test several architectures on the ImageNet [5] validation set (which contains 50,000 images). The empirical coverage of SelectiveNet with ResNet18 as its backbone on the test set is 0.648, and the selective risk we show in Table 4 is the model's risk for that exact coverage. (A sketch of loading the pretrained models and the validation split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU or CPU models, memory) used for running its experiments. It only mentions general tools like PyTorch for implementation. |
| Software Dependencies | No | The paper mentions 'implemented and pretrained by PyTorch [24]', but it does not specify a version number for PyTorch or any other key software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | In all of our experiments, we set the hyperparameters arbitrarily: ϵ_decay = 0.5, max_iterations = 15. (An illustrative attack-loop sketch using these hyperparameters follows the table.) |
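
The metrics quoted in the Research Type row (AURC, NLL, Brier score) are standard for evaluating uncertainty estimation. The sketch below shows one common way to compute them from softmax outputs; it assumes maximum-softmax-probability confidence for the risk-coverage curve and the ×10³ scaling of AURC used in the paper's tables, and is an illustration rather than the authors' evaluation code.

```python
# Hedged sketch (not the authors' code): NLL, Brier score, and AURC from softmax outputs.
import numpy as np

def nll(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean negative log-likelihood of the true class."""
    eps = 1e-12
    true_class_probs = probs[np.arange(len(labels)), labels]
    return float(-np.mean(np.log(true_class_probs + eps)))

def brier_score(probs: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared error between the softmax vector and the one-hot label."""
    one_hot = np.zeros_like(probs)
    one_hot[np.arange(len(labels)), labels] = 1.0
    return float(np.mean(np.sum((probs - one_hot) ** 2, axis=1)))

def aurc(probs: np.ndarray, labels: np.ndarray) -> float:
    """Area under the risk-coverage curve, with max softmax as the confidence score."""
    confidence = probs.max(axis=1)
    errors = (probs.argmax(axis=1) != labels).astype(float)
    order = np.argsort(-confidence)                 # most confident first
    cumulative_risk = np.cumsum(errors[order]) / np.arange(1, len(labels) + 1)
    return float(cumulative_risk.mean())            # paper reports AURC x 10^3

# Toy example with random predictions over 1000 ImageNet classes:
rng = np.random.default_rng(0)
logits = rng.normal(size=(32, 1000))
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
labels = rng.integers(0, 1000, size=32)
print(nll(probs, labels), brier_score(probs, labels), aurc(probs, labels) * 1e3)
```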
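
The paper states that the evaluated models were implemented and pretrained by PyTorch [24] (torchvision), except EfficientNet-B0, which comes from PyTorch Image Models [29] (timm), and that evaluation uses the 50,000-image ImageNet validation split. A minimal loading sketch under those assumptions, with a placeholder dataset path and standard ImageNet preprocessing, might look like this:

```python
# Hedged sketch of model and data loading; the dataset path and transform details are assumptions.
import timm
import torch
import torchvision
from torchvision import transforms

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# ImageNet validation set (50,000 images); "/data/imagenet" is a placeholder path.
val_set = torchvision.datasets.ImageNet("/data/imagenet", split="val",
                                        transform=preprocess)
val_loader = torch.utils.data.DataLoader(val_set, batch_size=64, num_workers=8)

models = {
    "mobilenet_v2": torchvision.models.mobilenet_v2(pretrained=True),
    "efficientnet_b0": timm.create_model("efficientnet_b0", pretrained=True),
}
for model in models.values():
    model.eval()
```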
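
The paper's Algorithm 1 (Attack on Confidence Estimation) is not reproduced in this table, so the sketch below is only an illustration of how the stated hyperparameters (ϵ_decay = 0.5, max_iterations = 15) could drive a label-preserving confidence attack: take signed-gradient steps that lower the confidence assigned to a correctly classified input, and shrink the step size by ϵ_decay whenever the predicted class would flip. The initial step size `eps` and the overall loop structure are assumptions, not the authors' procedure.

```python
# Illustrative sketch only -- NOT the paper's Algorithm 1.
import torch
import torch.nn.functional as F

def confidence_attack(model, x, eps=0.005, eps_decay=0.5, max_iterations=15):
    with torch.no_grad():
        original_pred = model(x).argmax(dim=1)
    x_adv = x.clone()
    for _ in range(max_iterations):
        x_try = x_adv.clone().requires_grad_(True)
        probs = F.softmax(model(x_try), dim=1)
        # Confidence assigned to the originally predicted class.
        confidence = probs.gather(1, original_pred.unsqueeze(1)).sum()
        confidence.backward()
        candidate = (x_try - eps * x_try.grad.sign()).detach()
        if (model(candidate).argmax(dim=1) == original_pred).all():
            x_adv = candidate            # labels preserved: accept the step
        else:
            eps = eps * eps_decay        # a label would flip: shrink the step
    return x_adv
```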