A Provable Decision Rule for Out-of-Distribution Detection
Authors: Xinsong Ma, Xin Zou, Weiwei Liu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results verify the superiority of the g-BH procedure over the traditional threshold-based decision rule on several OOD detection benchmarks, both from a practical perspective (focusing on TPR, FPR, and F1-score) and from a classical perspective (focusing on FPR95, AUROC, and AUPR). |
| Researcher Affiliation | Academia | School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China. |
| Pseudocode | Yes | Algorithm 1 (g-BH). 1: Input: training set $T$, calibrated set $T^{cal} = \{X^{cal}_1, X^{cal}_2, \dots, X^{cal}_m\}$, testing set $T^{test} = \{X^{test}_1, X^{test}_2, \dots, X^{test}_n\}$, prescribed level $\alpha \in (0, 1)$. 2: Train the score function $\hat{s}(x)$ on $T$. 3: Calculate the p-value corresponding to $X^{test}_i$: $p_i = p(X^{test}_i) = \frac{\lvert\{j \in [m] : \hat{s}(X^{cal}_j) \le \hat{s}(X^{test}_i)\}\rvert + 1}{m + 1}$. 4: Compute $i_{g\text{-}BH} = \max\{i \in [n] : f(p_{(i)}) \le \frac{i\alpha}{n}\}$. 5: Output: declare that $X^{test}_{(i)}$ is OOD if $i \le i_{g\text{-}BH}$, and the rest are ID. (A hedged code sketch of this procedure is given after the table.) |
| Open Source Code | No | The paper states 'our codes are based on Zhang et al. (2023b)' but does not provide a link to, or an availability statement for, the code developed in this paper. |
| Open Datasets | Yes | We use CIFAR-10 (Krizhevsky et al., 2009) as ID data, and use CIFAR-100, Tiny ImageNet (Krizhevsky et al., 2017), SVHN (Netzer et al., 2011), Texture (Kylberg, 2011), Places365 (Zhou et al., 2018) and MNIST (Deng, 2012) as OOD data. |
| Dataset Splits | No | The paper mentions using an 'ID validation set' and 'calibrated set' but does not provide specific split percentages or sample counts for training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used (e.g., GPU models, CPU, memory) for running experiments. |
| Software Dependencies | No | The paper states 'our codes are based on Zhang et al. (2023b)' but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | No | The paper mentions 'More details of the experimental settings can be found in Zhang et al. (2023b)' but does not provide specific hyperparameter values or detailed training configurations within its own text. |
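For readers who want to sanity-check the decision rule quoted in the Pseudocode row, below is a minimal sketch, not the authors' implementation. It assumes conformal p-values computed against calibration scores, uses the identity reshaping function f(p) = p (the plain BH special case; the paper's g-BH admits other choices of f), replaces the trained score function with random placeholder scores, and assumes higher scores indicate more ID-like inputs.

```python
# Minimal sketch of the g-BH decision rule as reconstructed from the Algorithm 1
# cell above. Assumptions (not taken from the paper): identity reshaping function
# f(p) = p, placeholder random scores, and higher score = more ID-like.
import numpy as np


def conformal_p_values(cal_scores: np.ndarray, test_scores: np.ndarray) -> np.ndarray:
    """p_i = (|{j : s(X_j^cal) <= s(X_i^test)}| + 1) / (m + 1)."""
    m = cal_scores.shape[0]
    # For each test point, count calibration scores that do not exceed it.
    counts = (cal_scores[None, :] <= test_scores[:, None]).sum(axis=1)
    return (counts + 1) / (m + 1)


def g_bh_decision(p_values: np.ndarray, alpha: float, f=lambda p: p) -> np.ndarray:
    """Return a boolean mask over test points: True = declared OOD.

    Declares OOD the i* smallest p-values, where
    i* = max{ i in [n] : f(p_(i)) <= i * alpha / n }.
    """
    n = p_values.shape[0]
    order = np.argsort(p_values)                  # indices sorted by p-value, ascending
    sorted_p = p_values[order]
    thresholds = alpha * np.arange(1, n + 1) / n  # i * alpha / n for i = 1..n
    below = f(sorted_p) <= thresholds
    is_ood = np.zeros(n, dtype=bool)
    if below.any():
        i_star = np.max(np.nonzero(below)[0])     # largest i satisfying the rule (0-based)
        is_ood[order[: i_star + 1]] = True        # reject all hypotheses up to i*
    return is_ood


# Toy usage with random scores standing in for a trained score function.
rng = np.random.default_rng(0)
cal = rng.normal(loc=1.0, size=500)                      # calibration (ID) scores
test = np.concatenate([rng.normal(1.0, size=80),         # ID-like test scores
                       rng.normal(-2.0, size=20)])       # OOD-like test scores
p = conformal_p_values(cal, test)
print(g_bh_decision(p, alpha=0.1).sum(), "points declared OOD")
```

Passing a different reshaping function as `f` is the only change needed to move from this plain BH special case toward the generalized variant described in the paper.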