Boosting Out-of-distribution Detection with Typical Features

Authors: Yao Zhu, Yuefeng Chen, Chuanlong Xie, Xiaodan Li, Rong Zhang, Hui Xue, Xiang Tian, Bolun Zheng, Yaowu Chen

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the superiority of our method on both the commonly used benchmark (CIFAR) and the more challenging high-resolution benchmark with a large label space (ImageNet). Notably, our approach outperforms state-of-the-art methods by up to 5.11% in average FPR95 on the ImageNet benchmark. Extensive experiments show that BATS establishes state-of-the-art performance among post-hoc methods on a suite of OOD detection benchmarks. (A sketch of the FPR95 metric follows this table.)
Researcher Affiliation | Collaboration | Yao Zhu (1,2), Yuefeng Chen (2), Chuanlong Xie (3), Xiaodan Li (2), Rong Zhang (2), Hui Xue (2), Xiang Tian (1,5), Bolun Zheng (4,5), Yaowu Chen (1,6). Affiliations: 1 Zhejiang University; 2 Alibaba Group; 3 Beijing Normal University; 4 Hangzhou Dianzi University; 5 Zhejiang Provincial Key Laboratory for Network Multimedia Technologies; 6 Zhejiang University Embedded System Engineering Research Center, Ministry of Education of China.
Pseudocode | No | The paper describes the proposed method in text and equations but does not include any structured pseudocode or algorithm blocks. (A hypothetical illustration of the general idea follows this table.)
Open Source Code | No | The code will be available at this https URL
Open Datasets | Yes | Dataset. For evaluating the large-scale OOD detection performance, we use ImageNet-1k [33] as the in-distribution dataset and consider four out-of-distribution datasets, including (subsets of) the fine-grained dataset iNaturalist [34], the scene recognition datasets Places [35] and SUN [36], and the texture dataset Textures [37] with non-overlapping categories to ImageNet-1k. As for the evaluation on CIFAR benchmarks, we use CIFAR-10 and CIFAR-100 [38] as the in-distribution datasets using the standard split with 50,000 training images and 10,000 test images. We consider four OOD datasets: SVHN [39], TinyImageNet [40], LSUN [41] and Textures [37].
Dataset Splits | No | As for the evaluation on CIFAR benchmarks, we use CIFAR-10 and CIFAR-100 [38] as the in-distribution datasets using the standard split with 50,000 training images and 10,000 test images. This statement gives training and test image counts but does not specify a validation split or how one would be derived from the training set for reproduction. (See the split sketch after this table.)
Hardware Specification | No | The paper does not specify the hardware used for running experiments, such as GPU models, CPU types, or cloud computing instances.
Software Dependencies | No | The paper mentions 'PyTorch [45]' as the framework for pre-trained models but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | The models are trained for 200 epochs with a batch size of 128. The starting learning rate is 0.1 and decays by a factor of 10 at epochs 100 and 150. (See the training-schedule sketch after this table.)
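
Since FPR95 is the headline metric quoted in the Research Type row, here is a minimal sketch of how it is conventionally computed: the false positive rate on OOD samples at the score threshold that retains 95% of in-distribution samples. The function name and the synthetic scores are illustrative, not taken from the paper.

```python
import numpy as np

def fpr_at_95_tpr(scores_id, scores_ood):
    """FPR95: false positive rate on OOD samples at the threshold where
    95% of in-distribution samples are kept.
    Convention here: higher score = more in-distribution."""
    # Threshold retaining 95% of ID samples (5th percentile of ID scores).
    threshold = np.percentile(scores_id, 5)
    # Fraction of OOD samples scoring above the threshold (false positives).
    return float(np.mean(scores_ood >= threshold))

# Toy usage with synthetic scores.
rng = np.random.default_rng(0)
id_scores = rng.normal(1.0, 0.5, size=10000)   # in-distribution
ood_scores = rng.normal(0.0, 0.5, size=10000)  # out-of-distribution
print(f"FPR95: {fpr_at_95_tpr(id_scores, ood_scores):.4f}")
```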
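As the Pseudocode row notes, the paper ships no algorithm blocks, so the following is only a hypothetical reconstruction of the general idea of truncating features into a "typical" interval around BatchNorm running statistics before scoring. The clamp bounds, the stand-in statistics, and the width hyperparameter `lam` are all assumptions for illustration, not the authors' specification.

```python
import torch

def truncate_features(feat, bn_mean, bn_std, lam=1.0):
    # Clamp features channel-wise into [mean - lam*std, mean + lam*std].
    # lam is a hypothetical width hyperparameter (an assumption).
    lower = bn_mean - lam * bn_std
    upper = bn_mean + lam * bn_std
    return torch.clamp(feat, min=lower, max=upper)

# Toy usage with random tensors standing in for penultimate feature maps.
feat = torch.randn(2, 512, 7, 7)
mean = torch.zeros(1, 512, 1, 1)  # stand-in for bn.running_mean
std = torch.ones(1, 512, 1, 1)    # stand-in for sqrt(bn.running_var + eps)
clipped = truncate_features(feat, mean, std, lam=1.0)
print(clipped.shape)  # torch.Size([2, 512, 7, 7])
```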
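For the CIFAR rows above, a minimal torchvision sketch of the standard 50,000/10,000 split, plus one possible way to carve out the validation set the paper leaves unspecified. The 45,000/5,000 proportions and the seed are our assumptions, not the authors'.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

tfm = transforms.ToTensor()

# Standard CIFAR-10 split: 50,000 training / 10,000 test images.
train_full = datasets.CIFAR10(root="data", train=True, download=True, transform=tfm)
test_set = datasets.CIFAR10(root="data", train=False, download=True, transform=tfm)

# The paper defines no validation split; carving 5,000 images (10%) out of
# the training set with a fixed seed is our assumption for reproducibility.
gen = torch.Generator().manual_seed(0)
train_set, val_set = random_split(train_full, [45000, 5000], generator=gen)

train_loader = DataLoader(train_set, batch_size=128, shuffle=True)
val_loader = DataLoader(val_set, batch_size=128)
test_loader = DataLoader(test_set, batch_size=128)
```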
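Finally, a sketch of the stated training schedule (200 epochs, batch size 128, learning rate 0.1 decayed by 10x at epochs 100 and 150) using PyTorch's MultiStepLR. The model choice, momentum, and weight decay are assumptions not given in the excerpt; the dummy loader stands in for the CIFAR-10 loader from the split sketch above.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18

# Dummy stand-in loader; in practice use train_loader from the split sketch.
dummy = TensorDataset(torch.randn(256, 3, 32, 32), torch.randint(0, 10, (256,)))
train_loader = DataLoader(dummy, batch_size=128, shuffle=True)

model = resnet18(num_classes=10)
# LR 0.1, decayed by a factor of 10 at epochs 100 and 150, as quoted above.
# Momentum and weight decay are assumptions, not stated in the excerpt.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                 milestones=[100, 150], gamma=0.1)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(200):
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # step once per epoch so milestones are in epochs
```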