Out of Distribution Data Detection Using Dropout Bayesian Neural Networks

Authors: Andre T. Nguyen, Fred Lu, Gary Lopez Munoz, Edward Raff, Charles Nicholas, James Holt (pp. 7877-7885)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we evaluate the value of randomized embedding based features across three different OOD data detection tasks in the vision, language, and malware domains. All experiments were implemented in PyTorch (Paszke et al. 2019), and neural networks were optimized using Adam with the default recommended settings (Kingma and Ba 2015).
Researcher Affiliation | Collaboration | Andre T. Nguyen (1,2,3), Fred Lu (1,2,3), Gary Lopez Munoz (1,2), Edward Raff (1,2,3), Charles Nicholas (3), James Holt (1). 1: Laboratory for Physical Sciences; 2: Booz Allen Hamilton; 3: University of Maryland, Baltimore County. andre@lps.umd.edu, lu_fred@bah.com, dlmgary@lps.umd.edu, edraff@lps.umd.edu, nicholas@umbc.edu, holt@lps.umd.edu
Pseudocode | Yes | Algorithm 1: Computing Randomized Embedding Based Features for OOD Data Detection
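The quoted algorithm title suggests drawing repeated stochastic forward passes through a dropout network and summarizing the resulting embeddings into features for OOD detection. The sketch below illustrates that general idea; the class and function names (`SmallNet`, `mc_embedding_features`) and the mean/standard-deviation summary are illustrative assumptions, not the paper's exact feature construction.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Tiny stand-in for a dropout network whose last layer is an embedding."""
    def __init__(self, in_dim=8, emb_dim=4):
        super().__init__()
        self.body = nn.Sequential(
            nn.Dropout(p=0.1),          # dropout is kept active at test time
            nn.Linear(in_dim, emb_dim),
            nn.ReLU(),
        )

    def forward(self, x):
        return self.body(x)

def mc_embedding_features(model, x, n_samples=32):
    """Draw n_samples stochastic embeddings; return per-dimension mean and std."""
    model.train()  # keep dropout active ("MC dropout")
    with torch.no_grad():
        embs = torch.stack([model(x) for _ in range(n_samples)])  # (S, B, D)
    return torch.cat([embs.mean(dim=0), embs.std(dim=0)], dim=-1)  # (B, 2D)

x = torch.randn(5, 8)
feats = mc_embedding_features(SmallNet(), x)
print(feats.shape)  # torch.Size([5, 8])
```

The resulting feature vector (here, concatenated mean and spread of the sampled embeddings) can then be fed to a downstream OOD detector such as a logistic regression.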
Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology or a link to a code repository.
Open Datasets | Yes | For our vision experiments, similarly to the evaluation protocol from (van Amersfoort et al. 2020; Ren et al. 2019; Postels et al. 2020; Mukhoti et al. 2021) we explore MNIST variants as OOD data. In particular, we train our base model, a LeNet5 (LeCun et al. 1998) with added dropout before each layer, on MNIST and use Kuzushiji-MNIST (Clanuwat et al. 2018), notMNIST (Bulatov 2011), and Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017) as OOD data. We train the dropout MalConv model for 5 epochs with a batch size of 32 on the EMBER2018 dataset which consists of portable executable files (PE files) scanned by VirusTotal in or before 2018 (Anderson and Roth 2018).
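The quoted vision setup ("a LeNet5 with added dropout before each layer") can be sketched as follows, assuming a standard LeNet5 layout for 28x28 MNIST inputs; the channel sizes, padding, and dropout placement details are assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class DropoutLeNet5(nn.Module):
    """LeNet5-style network with a dropout layer inserted before each weight layer."""
    def __init__(self, p=0.1, n_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Dropout2d(p), nn.Conv2d(1, 6, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Dropout2d(p), nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Dropout(p), nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Dropout(p), nn.Linear(120, 84), nn.ReLU(),
            nn.Dropout(p), nn.Linear(84, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))

logits = DropoutLeNet5()(torch.randn(2, 1, 28, 28))
print(logits.shape)  # torch.Size([2, 10])
```

Because every layer is preceded by dropout, repeated forward passes in train mode yield distinct stochastic predictions, which is what the embedding-sampling procedure relies on.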
Dataset Splits | No | The paper describes train/test splits and uses 3-fold cross-validation for logistic regression regularization, but does not specify a separate, distinct validation dataset split for the main model training.
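The 3-fold cross-validation for tuning the logistic regression's regularization, as noted above, could look like the following scikit-learn sketch; the synthetic data and the `Cs` grid are placeholders, not the paper's actual features or search space.

```python
import numpy as np
from sklearn.linear_model import LogisticRegressionCV

# Placeholder data standing in for the OOD-detection features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# LogisticRegressionCV searches a grid of inverse-regularization strengths C
# and selects the best one by 3-fold cross-validation.
clf = LogisticRegressionCV(Cs=10, cv=3, max_iter=1000).fit(X, y)
print(clf.C_)  # regularization strength chosen by 3-fold CV
```

Note that this cross-validation only tunes the detector's regularization; it is not a validation split for the base neural network, which is the distinction the row above draws.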
Hardware Specification | Yes | Experiments were run on an 80 CPU core machine with 512GB of RAM using a single 16GB Tesla P100 GPU.
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or other software dependencies with specific versions.
Experiment Setup | Yes | A dropout probability of p = 0.1 was used, and when sampling from the base neural network models to compute features for OOD detection, 32 samples are used. Training consisted of 50 epochs with a batch size of 128, where the 100 most common characters in the training set (after stripping accents) were used as the vocabulary and each datum was truncated/padded to a length of 200 characters. We train the dropout MalConv model for 5 epochs with a batch size of 32 on the EMBER2018 dataset.
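The quoted hyperparameters (Adam with default recommended settings, a fixed batch size, a small number of epochs) fit a standard PyTorch training skeleton; the tiny model and random data below are stand-ins for illustration, not the paper's MalConv or text models.

```python
import torch
import torch.nn as nn

# Stand-in model with the paper's dropout probability p = 0.1.
model = nn.Sequential(nn.Dropout(p=0.1), nn.Linear(16, 2))
opt = torch.optim.Adam(model.parameters())  # defaults: lr=1e-3, betas=(0.9, 0.999)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(128, 16)            # one batch of 128, as in the quoted text setup
y = torch.randint(0, 2, (128,))

for epoch in range(5):              # e.g. 5 epochs, as quoted for the MalConv model
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
print(float(loss) >= 0.0)  # True (cross-entropy loss is non-negative)
```

At evaluation time, the model would be kept in train mode so that its dropout layers remain stochastic, and 32 forward passes per input would be drawn to compute the OOD features described earlier.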