A Provable Decision Rule for Out-of-Distribution Detection
Authors: Xinsong Ma, Xin Zou, Weiwei Liu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results verify the superiority of the g-BH procedure over the traditional threshold-based decision rule on several OOD detection benchmarks, both from a practical perspective (focusing on TPR, FPR, and F1-score) and from a classical perspective (focusing on FPR95, AUROC, and AUPR). |
| Researcher Affiliation | Academia | School of Computer Science, National Engineering Research Center for Multimedia Software, Institute of Artificial Intelligence and Hubei Key Laboratory of Multimedia and Network Communication Engineering, Wuhan University, Wuhan, China. |
| Pseudocode | Yes | Algorithm 1 (g-BH). 1: Input: training set $T$, calibrated set $T^{cal} = \{X^{cal}_1, X^{cal}_2, \dots, X^{cal}_m\}$, testing set $T^{test} = \{X^{test}_1, X^{test}_2, \dots, X^{test}_n\}$, prescribed level $\alpha \in (0, 1)$. 2: Train the score function $\hat{s}(x)$ on $T$. 3: Calculate the p-value corresponding to $X^{test}_i$: $p_i = p(X^{test}_i) = \frac{\lvert\{j \in [m] : \hat{s}(X^{cal}_j) \le \hat{s}(X^{test}_i)\}\rvert + 1}{m + 1}$. 4: Compute $i_{g\text{-}BH} = \max\{i \in [n] : f(p_{(i)}) \le \frac{i\alpha}{n}\}$. 5: Output: declare that $X^{test}_{(i)}$ is OOD if $i \le i_{g\text{-}BH}$, and the rest are ID. (A hedged code sketch of this procedure is given after the table.) |
| Open Source Code | No | The paper states 'our codes are based on Zhang et al. (2023b)' but does not provide a link to, or an availability statement for, the code developed in this paper. |
| Open Datasets | Yes | We use CIFAR-10 (Krizhevsky et al., 2009) as ID data, and use CIFAR-100, Tiny ImageNet (Krizhevsky et al., 2017), SVHN (Netzer et al., 2011), Texture (Kylberg, 2011), Places365 (Zhou et al., 2018) and MNIST (Deng, 2012) as OOD data. |
| Dataset Splits | No | The paper mentions using an 'ID validation set' and 'calibrated set' but does not provide specific split percentages or sample counts for training, validation, and test sets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used (e.g., GPU models, CPU, memory) for running experiments. |
| Software Dependencies | No | The paper states 'our codes are based on Zhang et al. (2023b)' but does not list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | No | The paper mentions 'More details of the experimental settings can be found in Zhang et al. (2023b)' but does not provide specific hyperparameter values or detailed training configurations within its own text. |
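For readers who want to sanity-check the decision rule quoted in the Pseudocode row, below is a minimal sketch, not the authors' implementation. It assumes conformal p-values computed against calibration scores, uses the identity reshaping function f(p) = p (the plain BH special case; the paper's g-BH admits other choices of f), replaces the trained score function with random placeholder scores, and assumes higher scores indicate more ID-like inputs.

```python
# Minimal sketch of the g-BH decision rule as reconstructed from the Algorithm 1
# cell above. Assumptions (not taken from the paper): identity reshaping function
# f(p) = p, placeholder random scores, and higher score = more ID-like.
import numpy as np


def conformal_p_values(cal_scores: np.ndarray, test_scores: np.ndarray) -> np.ndarray:
    """p_i = (|{j : s(X_j^cal) <= s(X_i^test)}| + 1) / (m + 1)."""
    m = cal_scores.shape[0]
    # For each test point, count calibration scores that do not exceed it.
    counts = (cal_scores[None, :] <= test_scores[:, None]).sum(axis=1)
    return (counts + 1) / (m + 1)


def g_bh_decision(p_values: np.ndarray, alpha: float, f=lambda p: p) -> np.ndarray:
    """Return a boolean mask over test points: True = declared OOD.

    Declares OOD the i* smallest p-values, where
    i* = max{ i in [n] : f(p_(i)) <= i * alpha / n }.
    """
    n = p_values.shape[0]
    order = np.argsort(p_values)                  # indices sorted by p-value, ascending
    sorted_p = p_values[order]
    thresholds = alpha * np.arange(1, n + 1) / n  # i * alpha / n for i = 1..n
    below = f(sorted_p) <= thresholds
    is_ood = np.zeros(n, dtype=bool)
    if below.any():
        i_star = np.max(np.nonzero(below)[0])     # largest i satisfying the rule (0-based)
        is_ood[order[: i_star + 1]] = True        # reject all hypotheses up to i*
    return is_ood


# Toy usage with random scores standing in for a trained score function.
rng = np.random.default_rng(0)
cal = rng.normal(loc=1.0, size=500)                      # calibration (ID) scores
test = np.concatenate([rng.normal(1.0, size=80),         # ID-like test scores
                       rng.normal(-2.0, size=20)])       # OOD-like test scores
p = conformal_p_values(cal, test)
print(g_bh_decision(p, alpha=0.1).sum(), "points declared OOD")
```

Passing a different reshaping function as `f` is the only change needed to move from this plain BH special case toward the generalized variant described in the paper.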