Why Normalizing Flows Fail to Detect Out-of-Distribution Data
Authors: Polina Kirichenko, Pavel Izmailov, Andrew G. Wilson
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate why normalizing flows perform poorly for OOD detection. We demonstrate that flows learn local pixel correlations and generic image-to-latent-space transformations which are not specific to the target image datasets, focusing on flows based on coupling layers. We show that by modifying the architecture of flow coupling layers we can bias the flow towards learning the semantic structure of the target data, improving OOD detection. |
| Researcher Affiliation | Academia | Polina Kirichenko pk1822@nyu.edu New York University Pavel Izmailov pi390@nyu.edu New York University Andrew Gordon Wilson andrewgw@cims.nyu.edu New York University |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We also provide code at https://github.com/PolinaKirichenko/flows_ood. |
| Open Datasets | Yes | In Figure 1(a), we show the log-likelihood histogram for a Real NVP flow model [10] trained on the ImageNet dataset [37] subsampled to 64×64 resolution. The flow assigns higher likelihood to both the CelebA dataset of celebrity photos, and the SVHN dataset of images of house numbers, compared to the target ImageNet dataset. ... trained on FashionMNIST ... trained on CelebA using an SVHN image as OOD. ... We extract embeddings for CIFAR-10, CelebA and SVHN using an EfficientNet [43] pretrained on ImageNet [37]. |
| Dataset Splits | Yes | One approach is to choose a likelihood threshold τ on a validation dataset, e.g. to satisfy a desired false positive rate, and during test time identify inputs which have likelihood lower than τ as OOD (a minimal sketch of this protocol appears below the table). ... In practice, flows do not seem to overfit, assigning similar likelihood distributions to train and test (see e.g. Figure 1(a)). |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | For the details of the visualization procedure and the training setup please see Appendices E and C. ... Real NVP model with 2 coupling layers and checkerboard masks ... 3-layer Real NVP with horizontal masks ... We construct flows of exactly the same size and architecture (Real NVP with 8 coupling layers and no squeeze layers) with each of these masks, trained on CelebA and FashionMNIST. ... To do so, we introduce a bottleneck to the st-networks: a pair of fully-connected layers projecting to a space of dimension l and back to the original input dimension. We insert these layers after the middle layer of the st-network. If the latent dimension l is small, the st-network cannot simply reproduce its input as its output, and thus cannot exploit the local pixel correlations discussed in Section 6. Passing information through multiple layers with a low-dimensional bottleneck also reduces the effect of coupling layer co-adaptation. We train a Real NVP flow varying the latent dimension l on CelebA and on FashionMNIST. (Hedged sketches of the coupling masks and the bottleneck st-network appear below this table.) |
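As quoted in the Dataset Splits row, OOD detection with a flow reduces to thresholding log-likelihoods. The following is a minimal sketch of that protocol, assuming per-example log-likelihoods from an already-trained flow are available; the 5% false-positive rate and the placeholder Gaussian log-likelihoods are illustrative assumptions, not values from the paper.

```python
import numpy as np

def choose_threshold(val_log_probs: np.ndarray, target_fpr: float = 0.05) -> float:
    """Pick tau so that roughly `target_fpr` of in-distribution validation
    examples fall below it (i.e. would be wrongly flagged as OOD)."""
    return float(np.quantile(val_log_probs, target_fpr))

def flag_ood(log_probs: np.ndarray, tau: float) -> np.ndarray:
    """Inputs with likelihood lower than tau are identified as OOD."""
    return log_probs < tau

# Placeholder log-likelihoods standing in for a trained flow's outputs.
rng = np.random.default_rng(0)
val_lp = rng.normal(loc=-3500.0, scale=50.0, size=10_000)  # in-distribution validation
ood_lp = rng.normal(loc=-3300.0, scale=60.0, size=1_000)   # OOD scoring *higher*, as in Figure 1(a)
tau = choose_threshold(val_lp, target_fpr=0.05)
# The failure mode the paper documents: OOD data with likelihood above tau is never flagged.
print(f"tau = {tau:.1f}; flagged {flag_ood(ood_lp, tau).mean():.1%} of OOD inputs")
```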
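The Experiment Setup row refers to checkerboard and horizontal coupling masks. Below is a hedged sketch of the two mask types, assuming a binary per-pixel mask convention (1 = pass-through pixels that the st-network conditions on); the shapes and the 0/1 convention are illustrative, not the authors' exact code.

```python
import torch

def checkerboard_mask(h: int, w: int) -> torch.Tensor:
    """Alternating 0/1 pattern: neighboring pixels fall on opposite sides
    of the coupling split, exposing local pixel correlations."""
    ij = torch.arange(h).unsqueeze(1) + torch.arange(w).unsqueeze(0)
    return (ij % 2).float()

def horizontal_mask(h: int, w: int) -> torch.Tensor:
    """Top half of rows is passed through and conditions on the bottom half."""
    mask = torch.zeros(h, w)
    mask[: h // 2] = 1.0
    return mask
```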
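The bottleneck modification quoted in the Experiment Setup row (a pair of fully-connected layers projecting to dimension l and back, inserted after the middle layer of the st-network) can be sketched as follows. This is a minimal PyTorch sketch under assumed layer widths and a fully-connected st-network operating on flattened inputs; all class names and hyperparameters are illustrative assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class BottleneckSTNetwork(nn.Module):
    """st-network with a low-dimensional bottleneck after its middle layer."""
    def __init__(self, dim: int, hidden: int = 256, latent_l: int = 16):
        super().__init__()
        self.pre = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        # The bottleneck: project down to l dimensions and back up. If l is
        # small, the network cannot simply copy its input to its output,
        # suppressing the local-pixel-correlation shortcut.
        self.bottleneck = nn.Sequential(
            nn.Linear(hidden, latent_l), nn.ReLU(),
            nn.Linear(latent_l, hidden), nn.ReLU(),
        )
        # One head producing both the log-scale s and the translation t.
        self.post = nn.Linear(hidden, 2 * dim)

    def forward(self, x_masked: torch.Tensor):
        h = self.bottleneck(self.pre(x_masked))
        s, t = self.post(h).chunk(2, dim=-1)
        return torch.tanh(s), t  # tanh keeps the scale numerically tame

class AffineCoupling(nn.Module):
    """RealNVP-style coupling: masked dims parameterize an affine map of the rest."""
    def __init__(self, dim: int, mask: torch.Tensor, hidden: int = 256, latent_l: int = 16):
        super().__init__()
        self.register_buffer("mask", mask)  # 1 = pass-through dims
        self.st = BottleneckSTNetwork(dim, hidden, latent_l)

    def forward(self, x: torch.Tensor):
        x_masked = x * self.mask
        s, t = self.st(x_masked)
        y = x_masked + (1 - self.mask) * (x * torch.exp(s) + t)
        log_det = ((1 - self.mask) * s).sum(dim=-1)  # log|det J| of the affine map
        return y, log_det
```

For a 28×28 image flattened to 784 dimensions, `AffineCoupling(784, checkerboard_mask(28, 28).flatten())` applies one such layer; a full flow stacks several couplings with alternating masks, and shrinking `latent_l` is what biases the model toward semantic structure in the experiments quoted above.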