The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection

Authors: Qingyang Zhang, Qiuxuan Feng, Joey Tianyi Zhou, Yatao Bian, Qinghua Hu, Changqing Zhang

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Empirical studies show that our method achieves superior performance on standard benchmarks.
Researcher Affiliation Collaboration College of Intelligence and Computing, Tianjin University (1); A*STAR (2); Tencent AI Lab (3)
Pseudocode Yes Algorithm 1: Pseudo-Code of Decoupled Uncertainty Learning (DUL). Input: ID data P_ID, auxiliary outliers P_train^SEM, classifier f_θ0 pretrained on P_ID. Output: fine-tuned classifier f_θ. (1) Initialize θ = θ0; (2) for each iteration do: (3) obtain an ID sample (x, y) from P_ID and an auxiliary outlier x' from P_train^SEM; (4) update model parameters θ by minimizing the objective defined in Eq. 12.
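Algorithm 1's loop structure can be sketched in plain Python. This is a minimal illustration only: the gradient function below is a placeholder for the paper's Eq. 12 (which the excerpt does not spell out), and all names (`dul_finetune`, `loss_grad`) are hypothetical, not the authors' API.

```python
# Sketch of Algorithm 1 (DUL fine-tuning loop). The loss gradient is a
# stand-in for the paper's Eq. 12, which is not reproduced in this excerpt.
import random

def loss_grad(theta, x_id, y, x_ood):
    # Placeholder gradient: a real implementation would differentiate Eq. 12
    # (an ID classification term plus an uncertainty term on the outlier).
    return [0.1 * t for t in theta]

def dul_finetune(theta0, id_batches, ood_batches, lr=5e-5, iters=100):
    """Fine-tune pretrained parameters theta0 using ID data and auxiliary outliers."""
    theta = list(theta0)                        # step 1: initialize theta = theta0
    for _ in range(iters):                      # step 2: each iteration
        x_id, y = random.choice(id_batches)     # step 3: ID sample (x, y) from P_ID
        x_ood = random.choice(ood_batches)      #         auxiliary outlier from P_train^SEM
        grad = loss_grad(theta, x_id, y, x_ood)
        theta = [t - lr * g for t, g in zip(theta, grad)]  # step 4: SGD-style update
    return theta
```

A real run would replace the list of parameters with a PyTorch model and `loss_grad` with autograd on the Eq. 12 objective.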
Open Source Code Yes Our code is available at https://github.com/QingyangZhang/DUL.
Open Datasets Yes Datasets. ID datasets P_ID: we train the model on different ID datasets including CIFAR-10, CIFAR-100, and ImageNet-200 (a subset of ImageNet-1K [39] with 200 classes). Auxiliary OOD datasets P_train^SEM: in CIFAR experiments, we use ImageNet-RC as P_train^SEM; ImageNet-RC is a downsampled variant of the original ImageNet-1K that is widely adopted in previous OOD detection works [8, 11, 17]. We also conduct experiments on the recent TIN-597 [20] as an alternative. When ImageNet-200 is ID, the remaining 800 classes, termed ImageNet-800, are considered as P_train^SEM. OOD detection test sets P_test^SEM are a suite of diverse datasets introduced by a commonly used benchmark [5]. In CIFAR experiments, we use SVHN [40], Places365 [41], Textures [42], LSUN-R, LSUN-C [43], and iSUN [44] as P_test^SEM. When P_ID is ImageNet-200, P_test^SEM consists of iNaturalist [45], Open-Image [46], NINCO [47], and SSB-Hard [48].
Dataset Splits Yes It is worth noting that in the standard OOD detection setting [11, 4], the test OOD data should not have any overlapping classes or samples with the training-time auxiliary OOD data P_train^SEM. Let Y_test^SEM and Y_train^SEM be the label spaces of P_test^SEM and P_train^SEM respectively; then Y_test^SEM ∩ Y_train^SEM = ∅. Otherwise, OOD detection would be a trivial problem.
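The disjointness requirement Y_test^SEM ∩ Y_train^SEM = ∅ is easy to verify mechanically. A minimal sketch (function name and labels are illustrative, not from the paper's codebase):

```python
# Sanity check that the test-time OOD label space shares no classes with the
# training-time auxiliary outlier label space, as the standard setting requires.
def check_ood_split(y_sem_train, y_sem_test):
    overlap = set(y_sem_train) & set(y_sem_test)
    if overlap:
        raise ValueError(f"OOD split leaks classes: {sorted(overlap)}")
    return True
```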
Hardware Specification Yes We run all the experiments on one single NVIDIA GeForce RTX-3090 GPU.
Software Dependencies No The paper mentions "PyTorch implementation" and includes a Python code snippet, but does not specify exact version numbers for PyTorch or any other software libraries/frameworks used for the experiments.
Experiment Setup Yes Our settings follow the common practice [8, 11, 20, 5] in OOD detection. Here we present a brief description; more details about datasets, metrics, and implementation are in Appendix B.1 and B.2. ... We use WideResNet-40-10 [57] as the backbone network, which comprises 40 layers; the widen factor is set to 10. We use the SGD optimizer to train all methods with a dropout strategy. The dropout rate is 0.3. The momentum is set to 0.9 and weight decay is set to 0.0005. ... For DUL, α_0 is set to 12. While fine-tuning on CIFAR-10, m_ID and m_OOD are set to 10 and 30 respectively. The weights λ and γ are set to 0.3 and 2. We train for 20 epochs with an initial learning rate of 0.00005, utilizing a cosine annealing strategy to adjust the learning rate.
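The learning-rate schedule described above (initial rate 5e-5, cosine annealing over 20 epochs) follows the standard cosine formula. A small sketch, assuming a minimum rate of 0 since the excerpt does not state a floor:

```python
# Cosine-annealed learning rate, matching the reported fine-tuning schedule:
# lr0 = 5e-5 over 20 epochs. eta_min = 0 is an assumption, not from the paper.
import math

def cosine_lr(epoch, total_epochs=20, lr0=5e-5, eta_min=0.0):
    """Learning rate at a given epoch (0-indexed), annealed from lr0 to eta_min."""
    return eta_min + 0.5 * (lr0 - eta_min) * (1 + math.cos(math.pi * epoch / total_epochs))
```

This mirrors what PyTorch's `CosineAnnealingLR` scheduler computes per epoch given the same initial rate and horizon.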