Neuron Activation Coverage: Rethinking Out-of-distribution Detection and Generalization

Authors: Yibing Liu, Chris Xing Tian, Haoliang Li, Lei Ma, Shiqi Wang

ICLR 2024

Reproducibility assessment. Each entry below gives a reproducibility variable, its result, and the supporting LLM response.
Research Type: Experimental. In this paper, we study the OOD problem from a neuron activation view. We first formulate neuron activation states by considering both the neuron output and its influence on model decisions. Then, to characterize the relationship between neurons and OOD issues, we introduce the neuron activation coverage (NAC), a simple measure for neuron behaviors under InD data. Leveraging our NAC, we show that 1) InD and OOD inputs can be largely separated based on the neuron behavior, which significantly eases the OOD detection problem and beats 21 previous methods over three benchmarks (CIFAR-10, CIFAR-100, and ImageNet-1K); 2) a positive correlation between NAC and model generalization ability consistently holds across architectures and datasets, which enables a NAC-based criterion for evaluating model robustness. Compared to prevalent InD validation criteria, we show that NAC not only can select more robust models, but also has a stronger correlation with OOD test performance. Our code is available at: https://github.com/BierOne/ood_coverage. ... Section 3: EXPERIMENTS
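Because the formulation of activation states is only quoted in prose here, a minimal PyTorch sketch may help. It assumes a toy two-layer classifier, treats the penultimate features as the neurons, and follows the quoted description of combining a neuron's output with its gradient influence on a KL-to-uniform term; the steepness `alpha` and exact normalization are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
# Toy stand-in for a classifier; the paper computes states on pretrained
# ResNet/ViT backbones instead.
model = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))

def activation_states(x, alpha=100.0):
    feats = model[1](model[0](x))   # penultimate features = the "neurons"
    feats.retain_grad()             # keep gradients on this non-leaf tensor
    logits = model[2](feats)
    # KL divergence to a uniform distribution: a label-free proxy for the
    # model decision, per the paper's description.
    uniform = torch.full_like(logits, 1.0 / logits.size(-1))
    kl = F.kl_div(logits.log_softmax(dim=-1), uniform, reduction="batchmean")
    kl.backward()
    # State = sigmoid(alpha * output * influence), squashed into [0, 1].
    return torch.sigmoid(alpha * feats * feats.grad).detach()

states = activation_states(torch.randn(8, 32))
print(states.shape)  # torch.Size([8, 64]): one state per neuron per input
```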
Researcher Affiliation: Collaboration. Yibing Liu (1), Chris Xing Tian (1), Haoliang Li (1), Lei Ma (2,3), Shiqi Wang (1); affiliations: (1) City University of Hong Kong, (2) The University of Tokyo, (3) University of Alberta; emails: lyibing112@gmail.com, xingtian4-c@my.cityu.edu.hk, {haoliang.li,shiqiwang}@cityu.edu.hk, ma.lei@acm.org
Pseudocode: No. The paper does not contain any clearly labeled pseudocode or algorithm blocks; it describes the approximation of the PDF and the coverage function G(X, θ) via a Riemann approximation in text.
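To make that textual description concrete, here is a hedged sketch of a Riemann-style (histogram) approximation of the per-neuron state PDF and a resulting coverage score. The bin count `M` and lower bound `r` are the hyperparameters the paper analyzes, but the aggregation below is a simplification, not the authors' algorithm.

```python
import torch

def fit_nac(train_states, M=50, r=0.1):
    """Approximate each neuron's state PDF over [0, 1] with an M-bin
    histogram (a Riemann-style approximation), saturating coverage once
    a bin's probability mass reaches the lower bound r."""
    num_neurons = train_states.shape[1]
    pdf = torch.zeros(num_neurons, M)
    for j in range(num_neurons):
        pdf[j] = torch.histc(train_states[:, j], bins=M, min=0.0, max=1.0)
    pdf = pdf / pdf.sum(dim=1, keepdim=True)   # normalize counts to a PDF
    return torch.clamp(pdf / r, max=1.0)       # (num_neurons, M) coverage

def nac_ue_score(coverage, test_states):
    """Score a batch of test inputs: average coverage of the bins their
    neuron states fall into (higher = more InD-like, lower = more OOD)."""
    M = coverage.shape[1]
    bins = (test_states * M).long().clamp(max=M - 1)  # bin index per neuron
    neuron_idx = torch.arange(coverage.shape[0])
    return coverage[neuron_idx, bins].mean(dim=-1)    # one score per input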
Open Source Code: Yes. Our code is available at: https://github.com/BierOne/ood_coverage.
Open Datasets: Yes. We evaluate our NAC-UE on three benchmarks: CIFAR-10, CIFAR-100, and ImageNet-1K. For CIFAR-10 and CIFAR-100, the InD dataset corresponds to the respective CIFAR, and 4 OOD datasets are included: MNIST (Deng, 2012), SVHN (Netzer et al., 2011), Textures (Cimpoi et al., 2014), and Places365 (Zhou et al., 2018). For ImageNet experiments, ImageNet-1K serves as InD, along with 3 OOD datasets: iNaturalist (Horn et al., 2018), Textures (Cimpoi et al., 2014), and OpenImage-O (Wang et al., 2022).
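Several of the listed datasets ship with torchvision, so a minimal loading sketch for the CIFAR-10 benchmark could look like the following; the root path and transforms are placeholders, and the paper itself follows the OpenOOD toolkit's data pipeline rather than raw torchvision calls.

```python
import torchvision.transforms as T
from torchvision.datasets import CIFAR10, MNIST, SVHN

tf = T.Compose([T.Resize(32), T.ToTensor()])  # placeholder transform

ind_test = CIFAR10(root="data", train=False, download=True, transform=tf)
ood_sets = {
    # MNIST is grayscale, so replicate it to 3 channels for a CIFAR model.
    "mnist": MNIST(root="data", train=False, download=True,
                   transform=T.Compose([T.Grayscale(3), tf])),
    "svhn": SVHN(root="data", split="test", download=True, transform=tf),
}
```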
Dataset Splits: Yes. Specifically, for both CIFAR-10 and CIFAR-100, we utilize the official train set with 50,000 training images, and hold out 1,000 samples from the test set as the InD validation set. The 1,000 images covering 20 categories are held out from TinyImageNet (Le & Yang, 2015), serving as the OOD validation set. ... Following OpenOOD, we utilize 45,000 images from the ImageNet validation set as the InD test set, and the remaining 5,000 samples as the InD validation set. To search hyperparameters, 1,763 images from OpenImage-O (Wang et al., 2022) are picked out for OOD validation.
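A hedged sketch of the hold-out protocol quoted above, assuming a random permutation with a fixed seed; the paper adopts OpenOOD's official splits, whose exact indices may differ.

```python
import torch
from torch.utils.data import Subset

def hold_out_split(test_set, num_val=1000, seed=0):
    """Hold out `num_val` samples from a test set as an InD validation set,
    keeping the remainder as the InD test set."""
    g = torch.Generator().manual_seed(seed)          # fixed seed for reuse
    perm = torch.randperm(len(test_set), generator=g).tolist()
    val_set = Subset(test_set, perm[:num_val])
    ind_test_set = Subset(test_set, perm[num_val:])
    return ind_test_set, val_set
```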
Hardware Specification: Yes. All experiments are performed on a single NVIDIA GeForce RTX 3090 GPU, with Python version 3.8.11.
Software Dependencies: Yes. All experiments are performed on a single NVIDIA GeForce RTX 3090 GPU, with Python version 3.8.11. The deep learning framework used is PyTorch 1.10.0, and Torchvision version 0.11.1 is utilized for image processing. We leverage CUDA 11.3 for GPU acceleration.
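For reproduction, a quick environment check against the reported versions could look like this; it is illustrative, as the paper does not specify a pinning mechanism such as a requirements file.

```python
import sys
import torch
import torchvision

print(sys.version.split()[0])    # expect 3.8.11
print(torch.__version__)         # expect 1.10.0
print(torchvision.__version__)   # expect 0.11.1
print(torch.version.cuda)        # expect 11.3
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # expect NVIDIA GeForce RTX 3090
```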
Experiment Setup: Yes. Implementation details. We first build the NAC function using InD training data, utilizing 1,000 training images for ResNet-18 and ResNet-50, and 50,000 images for ViT-B/16. ... Parameter analysis. Tables 5-7 present a systematic analysis of the effect of the sigmoid steepness (α), the lower bound (r) for full coverage, and the number of intervals (M) for PDF approximation. ... In Tables 10-12, we list the values of selected hyperparameters for different model architectures over the CIFAR-10, CIFAR-100, and ImageNet benchmarks.
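Since the hyperparameter selection is only described in prose, here is a hedged sketch of a grid search over the three quantities above. The grid values are placeholders, and `evaluate_nac` is a hypothetical stub standing in for fitting NAC on InD data and scoring AUROC on the OOD validation split; the paper's actual selected values are listed in its Tables 10-12.

```python
import random
from itertools import product

def evaluate_nac(alpha, r, M):
    """Hypothetical stub: fit NAC on InD training data with these settings,
    then return AUROC on the held-out OOD validation set (e.g. OpenImage-O)."""
    return random.random()  # replace with the real fit-and-evaluate step

grid = {
    "alpha": [10, 100, 1000],   # sigmoid steepness (placeholder values)
    "r": [0.01, 0.1, 1.0],      # lower bound for full coverage
    "M": [25, 50, 100],         # intervals for the Riemann PDF approximation
}

best_auroc, best_cfg = -1.0, None
for alpha, r, M in product(grid["alpha"], grid["r"], grid["M"]):
    auroc = evaluate_nac(alpha, r, M)
    if auroc > best_auroc:
        best_auroc, best_cfg = auroc, {"alpha": alpha, "r": r, "M": M}
print(best_cfg)  # the setting with the best OOD-validation AUROC
```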