The Best of Both Worlds: On the Dilemma of Out-of-distribution Detection
Authors: Qingyang Zhang, Qiuxuan Feng, Joey Tianyi Zhou, Yatao Bian, Qinghua Hu, Changqing Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies show that our method achieves superior performance on standard benchmarks. |
| Researcher Affiliation | Collaboration | College of Intelligence and Computing, Tianjin University; A*STAR; Tencent AI Lab |
| Pseudocode | Yes | Algorithm 1: Pseudo Code of Decoupled Uncertainty Learning (DUL). Input: ID data P_ID, auxiliary outliers P_train^SEM, classifier f_θ0 pretrained on P_ID. Output: finetuned classifier f_θ. 1: Initialize θ = θ0; 2: for each iteration do 3: obtain ID sample (x, y) from P_ID and auxiliary outlier x from P_train^SEM; 4: update model parameters θ by minimizing the objective defined in Eq. 12. |
| Open Source Code | Yes | Our code is available at https://github.com/QingyangZhang/DUL. |
| Open Datasets | Yes | Datasets. ID datasets P_ID. We train the model on different ID datasets including CIFAR-10, CIFAR-100 and ImageNet-200 (a subset of ImageNet-1K [39] with 200 classes). Auxiliary OOD datasets P_train^SEM. In CIFAR experiments, we use ImageNet-RC as P_train^SEM. ImageNet-RC is a downsampled variant of the original ImageNet-1K which is widely adopted in previous OOD detection works [8, 11, 17]. We also conduct experiments on the recent TIN-597 [20] as an alternative. When ImageNet-200 is ID, the remaining 800 classes, termed ImageNet-800, are considered as P_train^SEM. OOD detection test sets P_test^SEM are a suite of diverse datasets introduced by a commonly used benchmark [5]. In CIFAR experiments, we use SVHN [40], Places365 [41], Textures [42], LSUN-R, LSUN-C [43] and iSUN [44] as P_test^SEM. When P_ID is ImageNet-200, P_test^SEM consists of iNaturalist [45], Open-Image [46], NINCO [47] and SSB-Hard [48]. |
| Dataset Splits | Yes | It is worth noting that in the standard OOD detection setting [11, 4], the test OOD data should not have any overlapping classes or samples with the training-time auxiliary OOD data P_train^SEM. Let Y_test^SEM and Y_train^SEM be the label spaces of P_test^SEM and P_train^SEM respectively; then Y_test^SEM ∩ Y_train^SEM = ∅. Otherwise, OOD detection would be a trivial problem. |
| Hardware Specification | Yes | We run all the experiments on one single NVIDIA GeForce RTX-3090 GPU. |
| Software Dependencies | No | The paper mentions "PyTorch implementation" and includes a Python code snippet, but does not specify exact version numbers for PyTorch or any other software libraries/frameworks used for the experiments. |
| Experiment Setup | Yes | Our settings follow common practice [8, 11, 20, 5] in OOD detection. Here we present a brief description; more details about datasets, metrics, and implementation are in Appendix B.1 and B.2. ... We use WideResNet-40-10 [57] as the backbone network, which comprises 40 layers with a widen factor of 10. We use the SGD optimizer to train all methods with a dropout strategy; the dropout rate is 0.3, momentum is set to 0.9, and weight decay is set to 0.0005. ... For DUL, α0 is set to 12. While finetuning on CIFAR-10, m_ID and m_OOD are set to 10 and 30 respectively. The weights λ, γ are set to 0.3 and 2. We train for 20 epochs with an initial learning rate of 0.00005, utilizing a cosine annealing strategy to adjust the learning rate. |
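The pseudocode and the reported optimizer settings (SGD, momentum 0.9, weight decay 5e-4, initial LR 5e-5, cosine annealing over 20 epochs) can be combined into a short training-loop sketch. This is an illustrative reconstruction, not the authors' code: `dul_objective` is a hypothetical stand-in for the paper's Eq. 12, whose exact form is not reproduced here, and the loader interface is assumed.

```python
import torch

def finetune_dul(model, id_loader, ood_loader, dul_objective, epochs=20):
    # Hyperparameters follow the reported setup: SGD with momentum 0.9,
    # weight decay 5e-4, initial learning rate 5e-5, cosine annealing.
    opt = torch.optim.SGD(model.parameters(), lr=5e-5,
                          momentum=0.9, weight_decay=5e-4)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=epochs)
    for _ in range(epochs):
        # Each step draws an ID batch (x, y) and an auxiliary-outlier
        # batch x_ood, mirroring lines 2-4 of Algorithm 1.
        for (x, y), (x_ood, _) in zip(id_loader, ood_loader):
            opt.zero_grad()
            # Placeholder for the DUL objective defined in Eq. 12.
            loss = dul_objective(model, x, y, x_ood)
            loss.backward()
            opt.step()
        sched.step()
    return model
```

Any differentiable surrogate can be plugged in as `dul_objective` to exercise the loop end to end.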
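The dataset-split requirement, that the label spaces of the auxiliary outliers and the test OOD sets be disjoint, can be verified mechanically. A minimal sketch (the class names are placeholders, not the actual benchmark label spaces):

```python
def assert_disjoint(train_ood_classes, test_ood_classes):
    """Check the standard OOD protocol: no class may appear in both the
    training-time auxiliary OOD label space and the test OOD label space."""
    overlap = set(train_ood_classes) & set(test_ood_classes)
    if overlap:
        raise ValueError(f"train/test OOD label spaces overlap: {overlap}")
    return True

# Disjoint label spaces pass the check.
assert_disjoint({"lynx", "sloth"}, {"church", "barn"})
```

The same check applied to overlapping label spaces raises, flagging a split that would make detection trivially easy.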