MultiOOD: Scaling Out-of-Distribution Detection for Multiple Modalities

Authors: Hao Dong, Yue Zhao, Eleni Chatzi, Olga Fink

NeurIPS 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on MultiOOD demonstrate that training with A2D and NP-Mix improves existing OOD detection algorithms by a large margin. To support accessibility and reproducibility, our source code and MultiOOD benchmark are available at https://github.com/donghao51/MultiOOD. Extensive experiments on the MultiOOD benchmark demonstrate the superiority of A2D and NP-Mix. We evaluate the performance via the use of the following metrics: (1) the false positive rate (FPR95) of OOD samples when the true positive rate of ID samples is at 95%, (2) the area under the receiver operating characteristic curve (AUROC), and (3) ID classification accuracy (ID ACC).
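The FPR95 and AUROC metrics quoted above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation; it assumes a convention where higher scores indicate in-distribution (ID) samples.

```python
import numpy as np

def fpr_at_95_tpr(id_scores, ood_scores):
    # Pick the threshold that retains 95% of ID samples
    # (i.e., 95% of ID scores lie at or above it).
    threshold = np.percentile(id_scores, 5)
    # FPR95: fraction of OOD samples wrongly scored as ID at that threshold.
    return float(np.mean(ood_scores >= threshold))

def auroc(id_scores, ood_scores):
    # AUROC equals the probability that a random ID sample outscores a
    # random OOD sample; computed here via the Mann-Whitney U statistic.
    scores = np.concatenate([id_scores, ood_scores])
    labels = np.concatenate([np.ones_like(id_scores), np.zeros_like(ood_scores)])
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_id, n_ood = len(id_scores), len(ood_scores)
    u = ranks[labels == 1].sum() - n_id * (n_id + 1) / 2
    return float(u / (n_id * n_ood))
```

This sketch ignores tied scores; a production implementation (e.g. `sklearn.metrics.roc_auc_score`) handles ties by averaging ranks.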
Researcher Affiliation | Academia | ETH Zürich; University of Southern California; EPFL
Pseudocode | No | The paper describes methods such as A2D and NP-Mix through textual descriptions and mathematical equations, but it includes no explicitly labeled "Pseudocode" or "Algorithm" blocks or figures.
Open Source Code | Yes | To support accessibility and reproducibility, our source code and MultiOOD benchmark are available at https://github.com/donghao51/MultiOOD.
Open Datasets | Yes | MultiOOD comprises five action recognition datasets (EPIC-Kitchens [48], HAC [20], HMDB51 [39], UCF101 [57], and Kinetics-600 [6]) with over 85,000 video clips in total.
Dataset Splits | Yes | We train the network for 50 epochs on an RTX 3090 GPU and select the model with the best performance on the validation dataset. EPIC-Kitchens 4/4 is derived from the EPIC-Kitchens Domain Adaptation dataset [48], where the dataset is partitioned into four classes for training as ID and four classes for testing as OOD. Different Random Splits: We run each experiment five times using different dataset splits for Multimodal Near-OOD Detection on the HMDB51 25/26 dataset in our MultiOOD benchmark, and then calculate the mean AUROC and FPR95 to demonstrate the statistical significance of our methods.
Hardware Specification | Yes | We train the network for 50 epochs on an RTX 3090 GPU.
Software Dependencies | No | We adopt the MMAction2 [11] toolkit for experiments. ... We use the Adam optimizer [36]. The paper names its software components but provides no version numbers for them (e.g., the MMAction2 or PyTorch versions used).
Experiment Setup | Yes | We use the Adam optimizer [36] with a learning rate of 0.0001 and a batch size of 16. Additionally, we set the hyperparameters as follows: γ = 0.5, mixup α = 10.0, nearest neighbor N = 2. We train the network for 50 epochs on an RTX 3090 GPU and select the model with the best performance on the validation dataset.
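The reported setup can be collected into a single configuration sketch. The hyperparameter values come from the quote above; the dictionary layout and the helper function are illustrative assumptions, not taken from the authors' codebase.

```python
import numpy as np

# Hyperparameters as reported in the paper's experiment setup.
CONFIG = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 16,
    "epochs": 50,
    "gamma": 0.5,         # γ weighting term reported in the setup
    "mixup_alpha": 10.0,  # α of the Beta distribution used for mixup
    "num_neighbors": 2,   # nearest neighbors N used by NP-Mix
    "model_selection": "best validation performance",
}

def sample_mixup_lambda(alpha: float, rng: np.random.Generator) -> float:
    # Standard mixup draws its interpolation weight λ ~ Beta(α, α);
    # with α = 10.0 the draws concentrate near 0.5 (strong mixing).
    return float(rng.beta(alpha, alpha))
```

With α this large, sampled λ values cluster around 0.5, so mixed samples land near the midpoint between the two inputs rather than close to either endpoint.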