SIREN: Shaping Representations for Detecting Out-of-Distribution Objects

Authors: Xuefeng Du, Gabriel Gozum, Yifei Ming, Yixuan Li

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | In this section, we validate the effectiveness of SIREN on object detection models, including the latest transformer-based (Section 4.1) and flagship CNN-based models (Section 4.2). For evaluating the OOD detection performance, we report: (1) the false positive rate (FPR95) of OOD objects when the true positive rate of ID samples is at 95%; (2) the area under the receiver operating characteristic curve (AUROC). For evaluating the object detection performance on the ID task, we report the common metric mAP. |
| Researcher Affiliation | Academia | Xuefeng Du, Gabriel Gozum, Yifei Ming, Yixuan Li; Department of Computer Sciences, University of Wisconsin-Madison; {xfdu,gozum,alvinming,sharonli}@cs.wisc.edu |
| Pseudocode | Yes | Algorithm 1 SIREN: Shaping Representations for object-level OOD detection |
| Open Source Code | Yes | Code is publicly available at https://github.com/deeplearning-wisc/siren. |
| Open Datasets | Yes | Datasets. Following [13], we use PASCAL-VOC [14] and Berkeley Deep Drive (BDD100K) [73] datasets as the ID training data. |
| Dataset Splits | No | The paper uses PASCAL-VOC and BDD100K as ID training data and MS-COCO/OPENIMAGES for OOD evaluation, but it does not explicitly state the specific training, validation, and test splits (e.g., percentages or sample counts) for its primary ID datasets in the main text. |
| Hardware Specification | No | The paper defers details about the amount of compute and type of resources used to Appendix G, and no specific hardware specifications (e.g., GPU models, CPU types) are mentioned in the main text. |
| Software Dependencies | No | The paper mentions models and frameworks like DEFORMABLE-DETR [75], DETR [4], and ResNet-50 [21], but it does not provide specific version numbers for any software dependencies or libraries (e.g., Python version, deep learning framework versions). |
| Experiment Setup | Yes | For the projection head, we use a two-layer MLP with a ReLU nonlinearity, with dimensionality 256 → d → d. The dimension d of the unit hypersphere is 16 for PASCAL-VOC and 64 for BDD100K. The default weight β for the SIREN is 1.5 and the prototype update factor α is 0.95. We initialize the learnable κ to be 10 for all classes. The k in the KNN distance is set to 10. |
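
The Experiment Setup values above can be pictured with a small, standard-library-only Python sketch. It is not the authors' implementation (see the linked repository for that). The helper names, the random weight initialization, and the toy feature vectors are all hypothetical. It also assumes that the KNN score is the negative Euclidean distance to the k-th nearest stored embedding, which the text above does not spell out.

```python
import math
import random

# Hyperparameters quoted in the Experiment Setup row; the dict layout is illustrative.
SETUP = {
    "input_dim": 256,
    "sphere_dim": {"PASCAL-VOC": 16, "BDD100K": 64},
    "beta": 1.5,         # weight of the SIREN term
    "alpha": 0.95,       # prototype update factor
    "kappa_init": 10.0,  # initial value of the learnable kappa
    "knn_k": 10,         # k in the KNN distance
}


def init_linear(n_in, n_out, rng):
    """Random weights and zero biases for one dense layer (hypothetical init)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def linear(x, layer):
    """Apply one dense layer to a feature vector."""
    weights, biases = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def l2_normalize(x):
    """Scale a vector to unit length so it lies on the unit hypersphere."""
    norm = math.sqrt(sum(v * v for v in x)) or 1.0
    return [v / norm for v in x]


def project(x, layer1, layer2):
    """Two-layer MLP (256 -> d -> d) with a ReLU, followed by L2 normalization."""
    hidden = [max(0.0, v) for v in linear(x, layer1)]
    return l2_normalize(linear(hidden, layer2))


def knn_score(query, bank, k):
    """Negative distance to the k-th nearest embedding in `bank`.

    Assumed convention: a higher score means more in-distribution.
    """
    dists = sorted(math.dist(query, ref) for ref in bank)
    return -dists[k - 1]


rng = random.Random(0)
d = SETUP["sphere_dim"]["PASCAL-VOC"]
layer1 = init_linear(SETUP["input_dim"], d, rng)
layer2 = init_linear(d, d, rng)

# Toy stand-ins for detector object features (random numbers, not real data).
bank = [project([rng.gauss(0, 1) for _ in range(256)], layer1, layer2) for _ in range(50)]
query = project([rng.gauss(0, 1) for _ in range(256)], layer1, layer2)
score = knn_score(query, bank, SETUP["knn_k"])
```

Because every embedding is normalized to unit length, the distance between any two of them is at most 2, so under these assumptions `score` always falls in the range [-2, 0]. Thresholding such a score on held-out ID data is one way the FPR95 metric named in the Research Type row could be computed.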