Diversified Outlier Exposure for Out-of-Distribution Detection via Informative Extrapolation

Authors: Jianing Zhu, Yu Geng, Jiangchao Yao, Tongliang Liu, Gang Niu, Masashi Sugiyama, Bo Han

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments and analyses have been conducted to characterize and demonstrate the effectiveness of the proposed DivOE.
Researcher Affiliation | Collaboration | Jianing Zhu (1), Geng Yu (2), Jiangchao Yao (2,3), Tongliang Liu (4), Gang Niu (5), Masashi Sugiyama (5,6), Bo Han (1,5); (1) Hong Kong Baptist University, (2) CMIC, Shanghai Jiao Tong University, (3) Shanghai AI Laboratory, (4) Sydney AI Centre, The University of Sydney, (5) RIKEN Center for Advanced Intelligence Project, (6) The University of Tokyo
Pseudocode | Yes | Algorithm 1: Diversified Outlier Exposure (DivOE) via Informative Extrapolation
Open Source Code | Yes | The code is publicly available at: https://github.com/tmlr-group/DivOE.
Open Datasets | Yes | Following the common benchmarks used in previous work [Liu et al., 2020, Ming et al., 2022], we adopt CIFAR-10 and CIFAR-100 [Krizhevsky, 2009] as our ID datasets.
Dataset Splits | No | The paper mentions tuning on a validation set in Appendix C.4 ("OOD detection performance on the validation set will not be further enhanced.") but gives no explicit split percentages or sample counts in the experimental setup of the main text or appendix.
Hardware Specification | Yes | All experiments are conducted with multiple runs on NVIDIA GeForce RTX 3090 GPUs with Python 3.7 and PyTorch 1.12.
Software Dependencies | Yes | All experiments are conducted with multiple runs on NVIDIA GeForce RTX 3090 GPUs with Python 3.7 and PyTorch 1.12.
Experiment Setup | Yes | We conduct all major experiments on a pre-trained WideResNet [Zagoruyko and Komodakis, 2016] with depth 40 and widen factor 2, and fix the number of fine-tuning epochs to 10, following previous research work [Hendrycks et al., 2019, Liu et al., 2020]. The models are trained using stochastic gradient descent [Kiefer and Wolfowitz, 1952] with Nesterov momentum [Duchi et al., 2011]. We adopt Cosine Annealing [Loshchilov and Hutter, 2017] to schedule the learning rate, which begins at 0.001. We set the momentum and weight decay to 0.9 and 10^-4, respectively, for all experiments. The mini-batch size is 128 for both ID and OOD samples during training and 200 for both during testing.
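
To make the reported hyper-parameters concrete, the following is a minimal PyTorch sketch of the described fine-tuning configuration: a pre-trained WideResNet-40-2, SGD with Nesterov momentum, a cosine-annealed learning rate starting at 0.001, momentum 0.9, weight decay 10^-4, 10 epochs, and batch size 128 for both ID and auxiliary OOD data. The `WideResNet` import, checkpoint filename, and outlier dataset path are assumptions for illustration; the loop uses the standard outlier-exposure objective of Hendrycks et al. [2019] as a stand-in, and DivOE's informative-extrapolation step, which diversifies the auxiliary outliers before this loss is applied, is not reproduced here (see the official repository for the actual method).

```python
# Minimal sketch of the reported fine-tuning setup (not the official DivOE code).
# Assumptions: a WideResNet-40-2 implementation, a pre-trained checkpoint, and an
# auxiliary outlier dataset are available at the placeholder paths below.
import torch
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

from wideresnet import WideResNet  # hypothetical module: WideResNet(depth, widen_factor, num_classes)

device = "cuda" if torch.cuda.is_available() else "cpu"
transform = transforms.Compose([transforms.ToTensor()])

# ID data: CIFAR-10 (CIFAR-100 is used analogously in the paper).
id_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
id_loader = DataLoader(id_train, batch_size=128, shuffle=True, num_workers=4)

# Auxiliary OOD data: any surrogate outlier set wrapped as an ImageFolder (placeholder path).
ood_train = datasets.ImageFolder(root="./data/auxiliary_outliers", transform=transform)
ood_loader = DataLoader(ood_train, batch_size=128, shuffle=True, num_workers=4)

# Pre-trained WideResNet-40-2, fine-tuned for 10 epochs.
model = WideResNet(depth=40, widen_factor=2, num_classes=10).to(device)
model.load_state_dict(torch.load("wrn_40_2_cifar10_pretrained.pt", map_location=device))

optimizer = torch.optim.SGD(
    model.parameters(), lr=0.001, momentum=0.9, weight_decay=1e-4, nesterov=True
)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=10)

model.train()
for epoch in range(10):
    for (x_id, y_id), (x_ood, _) in zip(id_loader, ood_loader):
        x_id, y_id, x_ood = x_id.to(device), y_id.to(device), x_ood.to(device)

        # NOTE: DivOE first extrapolates the auxiliary outliers to diversify them;
        # that step is omitted here and x_ood is used directly.
        logits_id = model(x_id)
        logits_ood = model(x_ood)

        # Standard outlier-exposure objective: cross-entropy on ID data plus a term
        # pushing the OOD predictive distribution toward uniform.
        loss_id = F.cross_entropy(logits_id, y_id)
        loss_oe = -(logits_ood.mean(dim=1) - torch.logsumexp(logits_ood, dim=1)).mean()
        loss = loss_id + 0.5 * loss_oe  # 0.5 is the weighting commonly used with OE

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```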