Constraining Representations Yields Models That Know What They Don't Know
Authors: João Monteiro, Pau Rodríguez, Pierre-André Noël, Issam H. Laradji, David Vázquez
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide further empirical evidence that TAC works well on multiple types of architectures and data modalities and that it is at least as good as state-of-the-art alternative confidence scores derived from existing models. [...] Evaluations are split into three main parts: Section 3.1: We start with a proof-of-concept and show that TAC can match activation patterns defined by class codes. We further show that small norm attackers are not able to match codes as well as clean data, rendering the distance between activation profiles and codes a good confidence score. Section 3.2: We then proceed to the main evaluation and use TAC as an add-on to existing classifiers. In this case, we evaluate performance under the rejection setting and show TAC to improve upon the base classifier. We further evaluate TAC when used to detect test data from unseen classes. Section 3.3: We seek additional applications of TAC and put it to test as a robust surrogate to the base classifier. |
| Researcher Affiliation | Industry | João Monteiro, Pau Rodríguez, Pierre-André Noël, Issam Laradji, David Vázquez ServiceNow Research {First Name.Last Name}@servicenow.com Currently at Apple. |
| Pseudocode | Yes | Figure 16 in the Appendix shows a Pytorch (Paszke et al., 2019) implementation of feature slicing and activation profile computation. [...] We present Python code snippets in Figure 15 showing that our choice of threshold used to compute the detection rate matches that of the standard Equal Error Rate. [...] In Figure 16, we show an example of an implementation of TAC's slice and reduce operations on top of 2-dimensional features. [...] Figure 17: Pytorch implementation of Mixup interpolations. |
| Open Source Code | No | The paper mentions 'code snippets of critical components are displayed in Figures 16 and 17' in the reproducibility statement, but it does not state that the full source code for the methodology is openly available or provide a link to a repository. |
| Open Datasets | Yes | To test for whether commonly used models are able to match activation patterns given by class codes, we train a TAC'ed WideResNet-28-10 (Madry et al., 2017) on CIFAR-10 (Krizhevsky et al., 2009)... [...] We consider intent prediction tasks of DialoGLUE (Mehri et al., 2020). Namely, we conduct experiments on HWU64 (Liu et al., 2021), Banking77 (Casanueva et al., 2020), and CLINC150 (Larson et al., 2019)... [...] We pre-trained a ViT Base-16x16 (Dosovitskiy et al., 2020) as the base predictor, and TAC operations are performed in 13 different layers throughout the model. We perform k-fold (k = 5) random splits on the validation set of ImageNet and, for a given split and value of ω, we then use the k − 1 left-out splits to select the confidence rejection threshold that maximizes V. |
| Dataset Splits | Yes | We perform k-fold (k = 5) random splits on the validation set of ImageNet and, for a given split and value of ω, we then use the k − 1 left-out splits to select the confidence rejection threshold that maximizes V. Curves averaged over splits are plotted for the data used for threshold selection (indicated as train in the plot) as well as for the left-out splits. |
| Hardware Specification | No | The paper states 'both training and evaluation across all applications we considered were performed in single-GPU hardware.' and 'it takes only a couple of hours to train TAC on a single GPU.' However, no specific GPU model (e.g., NVIDIA A100, RTX 3090, etc.) or any other detailed hardware specifications are provided. |
| Software Dependencies | No | The paper mentions 'Pytorch (Paszke et al., 2019)', 'Foolbox', and 'Torchvision' as tools used. While PyTorch is cited, specific version numbers for any of these software dependencies (e.g., 'Pytorch 1.x' or 'Foolbox vX.Y') are not explicitly provided in the text, which is required for reproducibility. |
| Experiment Setup | Yes | Training was performed with Adam in all cases except for models trained on MNIST and CIFAR-10, where SGD with momentum was employed. [...] Overall, we noticed that TAC tends to perform better when weight decay is not applied or when its coefficient is set to very small values (< 10⁻⁵). [...] The perturbation budget given to attackers in each case was 0.05, 0.02, and 0.1 for FGSM, PGD, and CW respectively. [...] We considered 5 projection configurations named small, large, very-large, x-large, and 2x-large, and the choice amongst those options is treated as a hyperparameter to be selected with cross-validation for each dataset we trained on. The numbers of fully connected layers for each configuration are 1, 2, 3, 4, and 5. |
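
The Pseudocode row above notes that the paper implements "feature slicing and activation profile computation" (Figure 16) and uses the distance between activation profiles and class codes as a confidence score. The paper's PyTorch code is not reproduced here; a minimal NumPy sketch of the general slice-and-reduce idea, with all function names and the mean-reduction choice being illustrative assumptions, might look like:

```python
import numpy as np

def activation_profile(features, n_slices):
    # Split a 1-D feature vector into contiguous slices and reduce each
    # slice to a scalar (mean here, as an assumed reduction), yielding
    # an activation profile of length n_slices.
    slices = np.array_split(features, n_slices)
    return np.array([s.mean() for s in slices])

def confidence(features, class_code, n_slices):
    # Confidence score: negative Euclidean distance between the
    # activation profile and the class code. Larger (closer to zero)
    # means the activations match the code more closely.
    profile = activation_profile(features, n_slices)
    return -np.linalg.norm(profile - class_code)
```

Under this sketch, clean inputs whose activations match their class code score higher than inputs whose profiles drift away from it, which is how a rejection threshold on the score can separate the two.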
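
The Pseudocode row also cites Figure 15, which checks that the paper's detection-rate threshold matches the standard Equal Error Rate. The paper's snippet is not reproduced here; a generic EER-threshold search (a sketch, with a simple candidate scan rather than whatever the authors implemented) can be written as:

```python
import numpy as np

def eer_threshold(pos_scores, neg_scores):
    # Scan candidate thresholds and return the one where the
    # false-reject rate on positives (in-distribution / clean data)
    # is closest to the false-accept rate on negatives.
    candidates = np.sort(np.concatenate([pos_scores, neg_scores]))
    best_t, best_gap = candidates[0], np.inf
    for t in candidates:
        frr = np.mean(pos_scores < t)   # positives rejected at t
        far = np.mean(neg_scores >= t)  # negatives accepted at t
        gap = abs(frr - far)
        if gap < best_gap:
            best_gap, best_t = gap, t
    return best_t
```

The brute-force scan is O(n²) but transparent; with sorted scores the same threshold can be found in one pass, which matters only for large validation sets.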