Diffusion-based Layer-wise Semantic Reconstruction for Unsupervised Out-of-Distribution Detection
Authors: Ying Yang, De Cheng, Chaowei Fang, Yubiao Wang, Changzhe Jiao, Lechao Cheng, Nannan Wang, Xinbo Gao
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results on multiple benchmarks built upon various datasets demonstrate that our method achieves state-of-the-art performance in terms of detection accuracy and speed. |
| Researcher Affiliation | Academia | ¹Xidian University, ²Hefei University of Technology, ³Chongqing University of Posts and Telecommunications |
| Pseudocode | Yes | Algorithm 1 Training Algorithm; Algorithm 2 Testing Algorithm; Algorithm 3 Testing Algorithm for MSE Calculation; Algorithm 4 Testing Algorithm for LR Calculation |
| Open Source Code | Yes | Code is available at https://github.com/xbyym/DLSR. |
| Open Datasets | Yes | We train the OOD detection model on three in-distribution (ID) datasets: CIFAR-10 [Krizhevsky et al., 2009], CIFAR-100, and CelebA [Liu et al., 2015]. |
| Dataset Splits | No | The paper lists the ID datasets used for training (CIFAR-10, CIFAR-100, CelebA) and the OOD datasets used for testing, but does not explicitly provide percentages or sample counts for training, validation, or test splits of the ID datasets. |
| Hardware Specification | Yes | Our method is trained on an NVIDIA GeForce 4090 GPU for 150 epochs, with a batch size of 128 and a constant learning rate of 10⁻⁴ throughout the training phase. |
| Software Dependencies | No | The paper mentions specific components like "EfficientNet-b4", "ResNet50", and "AdamW optimizer", but does not provide version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | For EfficientNet-b4, we select feature maps from the first to fifth stages (M = 5) to construct the multi-layer semantic features, resulting in a feature dimension (c) of 720. The LFDN consists of 16 residual blocks. Inside each residual block, the number of groups in GroupNorm and the intermediate feature dimension of the residual branch are set to 1 and 1440, respectively. We employ the AdamW optimizer with a weight decay of 10⁻⁴. Our method is trained on an NVIDIA GeForce 4090 GPU for 150 epochs, with a batch size of 128 and a constant learning rate of 10⁻⁴ throughout the training phase. |
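
To make the Experiment Setup row concrete, here is a minimal PyTorch sketch of one LFDN residual block under the hyperparameters quoted above (feature dimension c = 720, GroupNorm with 1 group, intermediate width 1440, 16 blocks, AdamW with learning rate and weight decay of 10⁻⁴). The layer ordering, the SiLU activation, and the assumption that the LFDN operates on flat 1-D semantic feature vectors are ours, not taken from the paper.

```python
import torch
import torch.nn as nn

class LFDNResidualBlock(nn.Module):
    """One residual block; layer order and SiLU activation are assumptions."""
    def __init__(self, dim: int = 720, hidden: int = 1440):
        super().__init__()
        # GroupNorm with a single group, as reported in the setup row.
        self.norm = nn.GroupNorm(num_groups=1, num_channels=dim)
        self.branch = nn.Sequential(
            nn.Linear(dim, hidden),   # intermediate feature dimension 1440
            nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, dim); GroupNorm accepts (N, C) inputs directly.
        return x + self.branch(self.norm(x))

# 16 residual blocks stacked, with the optimizer settings from the setup row.
lfdn = nn.Sequential(*[LFDNResidualBlock() for _ in range(16)])
optimizer = torch.optim.AdamW(lfdn.parameters(), lr=1e-4, weight_decay=1e-4)
```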
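
The Pseudocode row lists separate testing algorithms for MSE and likelihood-ratio (LR) scores. Below is a hedged sketch of what such test-time scoring might look like; `extract_features`, `reconstruct`, `log_p_id`, and `log_p_bg` are hypothetical placeholders for the paper's feature extractor, diffusion-based reconstruction, and density estimates, and the score directions are assumptions.

```python
import torch

@torch.no_grad()
def mse_score(x, extract_features, reconstruct):
    """Reconstruction-error score in the spirit of Algorithm 3 (placeholders)."""
    z = extract_features(x)        # multi-layer semantic feature, (batch, 720)
    z_hat = reconstruct(z)         # diffusion-based layer-wise reconstruction
    return ((z - z_hat) ** 2).mean(dim=1)   # larger error -> more likely OOD

@torch.no_grad()
def lr_score(x, extract_features, log_p_id, log_p_bg):
    """Likelihood-ratio score in the spirit of Algorithm 4 (placeholders)."""
    z = extract_features(x)
    return log_p_bg(z) - log_p_id(z)        # larger ratio -> more likely OOD
```

At test time, a threshold on either score would separate ID from OOD samples; the choice of threshold (and of any fusion of the two scores) would come from validation data rather than anything specified here.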