Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
VOS: Learning What You Don't Know by Virtual Outlier Synthesis
Authors: Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li
ICLR 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | VOS achieves competitive performance on both object detection and image classification models, reducing the FPR95 by up to 9.36% compared to the previous best method on object detectors. Code is available at https://github.com/deeplearning-wisc/vos. |
| Researcher Affiliation | Academia | Xuefeng Du, Zhaoning Wang, Mu Cai, Yixuan Li; Department of Computer Sciences, University of Wisconsin-Madison |
| Pseudocode | Yes | Algorithm 1 VOS: Virtual Outlier Synthesis for OOD detection |
| Open Source Code | Yes | Code is available at https://github.com/deeplearning-wisc/vos. |
| Open Datasets | Yes | We use PASCAL VOC (Everingham et al., 2010) and Berkeley Deep Drive (BDD-100k) (Yu et al., 2020) datasets as the ID training data. For both tasks, we evaluate on two OOD datasets that contain subset of images from: MS-COCO (Lin et al., 2014) and Open Images (validation set) (Kuznetsova et al., 2020). |
| Dataset Splits | Yes | We use CIFAR-10 (Krizhevsky & Hinton, 2009) as the ID training data, with standard train/val splits. ID val dataset VOC val BDD val |
| Hardware Specification | Yes | We run all experiments with Python 3.8.5 and PyTorch 1.7.0, using NVIDIA GeForce RTX 2080Ti GPUs. |
| Software Dependencies | Yes | We run all experiments with Python 3.8.5 and PyTorch 1.7.0, using NVIDIA GeForce RTX 2080Ti GPUs. |
| Experiment Setup | Yes | We use the Detectron2 library (Girshick et al., 2018) and train on two backbone architectures: ResNet-50 (He et al., 2016) and RegNetX-4.0GF (Radosavovic et al., 2020). We employ a two-layer MLP with a ReLU nonlinearity for φ in Equation 5, with hidden layer dimension of 512. For each in-distribution class, we use 1,000 samples to estimate the class-conditional Gaussians. [...] The PASCAL model is trained for a total of 18,000 iterations, and the BDD-100k model is trained for 90,000 iterations. We add the uncertainty regularizer (Equation 5) starting from 2/3 of the training. The weight β is set to 0.1. |
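
The training schedule quoted in the Experiment Setup row can be made concrete with a little arithmetic. The sketch below is illustrative only and is not taken from the VOS repository: the class `VosSchedule` and its fields are hypothetical names, and it assumes that "starting from 2/3 of the training" means the regularizer is switched on at the first iteration at or beyond two thirds of the total, with a weight of zero before that.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VosSchedule:
    """Hypothetical container for the schedule values quoted in the table."""

    total_iterations: int
    beta: float = 0.1           # regularizer weight reported in the table
    start_numerator: int = 2    # regularizer begins at 2/3 of training
    start_denominator: int = 3

    @property
    def start_iteration(self) -> int:
        # Integer arithmetic avoids floating-point rounding at the boundary.
        return (self.start_numerator * self.total_iterations) // self.start_denominator

    def regularizer_weight(self, iteration: int) -> float:
        # Assumption: weight is 0 before the start iteration and beta from it onward.
        return self.beta if iteration >= self.start_iteration else 0.0


# Iteration counts as reported for the two object detection settings.
PASCAL = VosSchedule(total_iterations=18_000)
BDD = VosSchedule(total_iterations=90_000)

if __name__ == "__main__":
    for name, schedule in (("PASCAL VOC", PASCAL), ("BDD-100k", BDD)):
        print(name, "regularizer starts at iteration", schedule.start_iteration)
```

Under these assumptions the regularizer would be active for the last 6,000 of 18,000 iterations on PASCAL VOC and the last 30,000 of 90,000 on BDD-100k. Whether the boundary iteration itself is included is a detail to check against the released code.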