In or Out? Fixing ImageNet Out-of-Distribution Detection Evaluation

Authors: Julian Bitterwolf, Maximilian Müller, Matthias Hein

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We provide detailed evaluations across a large set of architectures and OOD detection methods on NINCO and the unit-tests, revealing new insights about model weaknesses and the effects of pretraining on OOD detection performance.
Researcher Affiliation | Academia | University of Tübingen and Tübingen AI Center. Correspondence to: Julian Bitterwolf <julian.bitterwolf@uni-tuebingen.de>.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | We provide code and data at https://github.com/j-cb/NINCO.
Open Datasets | Yes | We provide code and data at https://github.com/j-cb/NINCO.
Dataset Splits | No | The paper discusses concepts such as setting thresholds based on true positive rates and methods that compute statistics on the train set, but it does not explicitly state percentages or counts for the training, validation, or test splits used in its own experiments.
Hardware Specification | No | The paper does not explicitly describe the hardware used to run its experiments, such as specific GPU or CPU models.
Software Dependencies | Yes | All model implementations and model weights were taken from the publicly available timm repository (Wightman, 2019)... for the ViTs finetuned from CLIP and the ViT without pretraining we used timm version 0.8.0dev0, for all other models version 0.6.12. (A hedged model-loading sketch follows the table.)
Experiment Setup | Yes | As suggested in (Sun et al., 2022), we use K = 1000. ... As suggested in (Wang et al., 2022a), we set the threshold r such that 1% of the activations from the train set would be truncated. ... As suggested in (Wang et al., 2022a), we use D = 1000 if the dimensionality of the feature space d satisfies d ≥ 2048, D = 512 if 2048 > d ≥ 768, and D = d/2 rounded to integers otherwise. (See the hyperparameter sketch below.)
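
The timm dependency noted under Software Dependencies can be exercised as below. This is a minimal sketch under the pinned versions quoted in the table, not the authors' evaluation code; the checkpoint name is illustrative, and only the public timm calls `create_model`, `resolve_data_config`, and `create_transform` are assumed.

```python
# Minimal sketch (not the authors' code): load an ImageNet-1k checkpoint from timm
# under the pinned versions quoted above (0.6.12, or 0.8.0dev0 for the CLIP-finetuned
# ViTs and the ViT without pretraining). The model name here is illustrative.
import timm
import torch
from timm.data import resolve_data_config, create_transform

model = timm.create_model("vit_base_patch16_224", pretrained=True)
model.eval()

# timm exposes the preprocessing that matches each checkpoint (resolution, mean/std, crop).
config = resolve_data_config({}, model=model)
preprocess = create_transform(**config)

# Dummy forward pass; a real evaluation would feed preprocessed ImageNet / NINCO images.
with torch.no_grad():
    logits = model(torch.randn(1, *config["input_size"]))
print(logits.shape)  # torch.Size([1, 1000])
```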
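
The hyperparameter rules quoted under Experiment Setup can be written out directly. The sketch below only illustrates those rules; the function names are hypothetical and not taken from the NINCO repository, and the KNN score follows the negative distance-to-k-th-neighbor formulation of Sun et al. (2022) on L2-normalized features.

```python
# Hedged illustration of the hyperparameter rules quoted in the table;
# helper names are hypothetical, not taken from the NINCO repository.
import numpy as np

def vim_principal_dim(d: int) -> int:
    """Residual-space dimension D as a function of feature dimension d
    (rule quoted above, following Wang et al., 2022a)."""
    if d >= 2048:
        return 1000
    if d >= 768:
        return 512
    return round(d / 2)

def truncation_threshold(train_activations: np.ndarray, frac: float = 0.01) -> float:
    """Threshold r chosen so that the top `frac` (here 1%) of train-set
    activations would be truncated."""
    return float(np.quantile(train_activations, 1.0 - frac))

def knn_ood_score(test_feats: np.ndarray, train_feats: np.ndarray, k: int = 1000) -> np.ndarray:
    """Negative distance to the k-th nearest L2-normalized training feature
    (Sun et al., 2022), with K = 1000 as quoted above.
    Naive all-pairs distances, for illustration only."""
    train = train_feats / np.linalg.norm(train_feats, axis=1, keepdims=True)
    test = test_feats / np.linalg.norm(test_feats, axis=1, keepdims=True)
    dists = np.linalg.norm(test[:, None, :] - train[None, :, :], axis=-1)
    return -np.sort(dists, axis=1)[:, k - 1]

# Example: a 768-dimensional ViT feature space gets D = 512.
print(vim_principal_dim(768))  # 512
```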