Out of Distribution Data Detection Using Dropout Bayesian Neural Networks
Authors: Andre T. Nguyen, Fred Lu, Gary Lopez Munoz, Edward Raff, Charles Nicholas, James Holt
AAAI 2022, pp. 7877-7885
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate the value of randomized embedding based features across three different OOD data detection tasks in the vision, language, and malware domains. All experiments were implemented in PyTorch (Paszke et al. 2019), and neural networks were optimized using Adam with the default recommended settings (Kingma and Ba 2015). |
| Researcher Affiliation | Collaboration | Andre T. Nguyen,1,2,3 Fred Lu,1,2,3 Gary Lopez Munoz,1,2 Edward Raff,1,2,3 Charles Nicholas,3 James Holt1 1Laboratory for Physical Sciences 2Booz Allen Hamilton 3University of Maryland, Baltimore County andre@lps.umd.edu, lu_fred@bah.com, dlmgary@lps.umd.edu, edraff@lps.umd.edu, nicholas@umbc.edu, holt@lps.umd.edu |
| Pseudocode | Yes | Algorithm 1: Computing Randomized Embedding Based Features for OOD Data Detection |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | For our vision experiments, similarly to the evaluation protocol from (van Amersfoort et al. 2020; Ren et al. 2019; Postels et al. 2020; Mukhoti et al. 2021) we explore MNIST variants as OOD data. In particular, we train our base model, a LeNet5 (LeCun et al. 1998) with added dropout before each layer, on MNIST and use Kuzushiji-MNIST (Clanuwat et al. 2018), notMNIST (Bulatov 2011), and Fashion-MNIST (Xiao, Rasul, and Vollgraf 2017) as OOD data. We train the dropout MalConv model for 5 epochs with a batch size of 32 on the EMBER2018 dataset, which consists of portable executable files (PE files) scanned by VirusTotal in or before 2018 (Anderson and Roth 2018). |
| Dataset Splits | No | The paper describes train/test splits, and uses 3-fold cross-validation for logistic regression regularization, but does not specify a separate, distinct validation dataset split for the main model training. |
| Hardware Specification | Yes | Experiments were run on an 80 CPU core machine with 512GB of RAM using a single 16GB Tesla P100 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify its version number or other software dependencies with specific versions. |
| Experiment Setup | Yes | A dropout probability of p = 0.1 was used, and when sampling from the base neural network models to compute features for OOD detection, 32 samples are used. Training consisted of 50 epochs with a batch size of 128, where the 100 most common characters in the training set (after stripping accents) were used as the vocabulary and each datum was truncated/padded to a length of 200 characters. We train the dropout MalConv model for 5 epochs with a batch size of 32 on the EMBER2018 dataset. |
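The setup above (dropout p = 0.1, 32 stochastic samples per input, summary features fed to an OOD detector) can be illustrated with a minimal stdlib-only sketch. This is not the paper's Algorithm 1 or its PyTorch implementation; the toy one-layer network, the `ood_features` helper, and the choice of per-dimension mean and standard deviation as summary statistics are illustrative assumptions.

```python
import random
import statistics

def dropout(vec, p, rng):
    """Zero each element with probability p and rescale survivors
    (inverted dropout), so the expected activation is unchanged."""
    keep = 1.0 - p
    return [0.0 if rng.random() < p else v / keep for v in vec]

def embed(x, weights, p, rng):
    """Toy stochastic 'network': dropout on the input, then a linear map.
    Each call gives a different embedding because dropout stays active."""
    h = dropout(x, p, rng)
    return [sum(w_i * h_i for w_i, h_i in zip(row, h)) for row in weights]

def ood_features(x, weights, p=0.1, n_samples=32, seed=0):
    """Draw n_samples stochastic embeddings and summarize each dimension
    by its mean and standard deviation across the samples; the resulting
    vector can be passed to a downstream OOD detector."""
    rng = random.Random(seed)
    samples = [embed(x, weights, p, rng) for _ in range(n_samples)]
    dims = list(zip(*samples))  # transpose: one tuple per embedding dim
    means = [statistics.fmean(d) for d in dims]
    stds = [statistics.stdev(d) for d in dims]
    return means + stds

# Example: a 4-dim input mapped to a 2-dim embedding
weights = [[0.5, -0.2, 0.1, 0.3], [0.0, 0.4, -0.1, 0.2]]
features = ood_features([1.0, 2.0, 3.0, 4.0], weights, p=0.1, n_samples=32)
print(len(features))  # 2 means + 2 standard deviations = 4 features
```

The key point the sketch captures is that dropout is left on at inference time, so repeated forward passes yield a distribution over embeddings; out-of-distribution inputs tend to produce different sample statistics than in-distribution ones.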