Revisiting flow generative models for Out-of-distribution detection

Authors: Dihong Jiang, Sun Sun, Yaoliang Yu

ICLR 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimentally, firstly we confirm the efficacy of our method against state-of-the-art baselines through extensive experiments on several image datasets; secondly we investigate the relationship between model accuracy (e.g., the generation quality) and the OOD detection performance, and found surprisingly that they are not always positively correlated; and thirdly we show that detection in the latent space of flow models generally outperforms detection in the sample space across various OOD datasets, hence highlighting the benefits of training a flow model. |
| Researcher Affiliation | Collaboration | Dihong Jiang, School of Computer Science, University of Waterloo, Vector Institute, dihong.jiang@uwaterloo.ca; Sun Sun, School of Computer Science, University of Waterloo, National Research Council, sun.sun@nrc-cnrc.gc.ca; Yaoliang Yu, School of Computer Science, University of Waterloo, Vector Institute, yaoliang.yu@uwaterloo.ca |
| Pseudocode | Yes | Algorithm 1: Group OOD detection based on one-sample KS test (GOD1KS). Algorithm 2: Group OOD detection based on two-sample KS test (GOD2KS). |
| Open Source Code | No | The paper states that its PyTorch implementations were "derived from" existing public repositories (e.g., "Our Pytorch implementation of Glow was derived from Joost van Amersfoort's repository" and "Our Pytorch implementation of Real NVP was derived from Ilya Kostrikov's repository"), but it does not explicitly provide a link or statement for the authors' *own* modified or experimental code. |
| Open Datasets | Yes | Grayscale image datasets: MNIST (LeCun et al., 1998): MNIST is a dataset of handwritten digits, including 10 classes (from digit 0 to 9) and 70000 images in total. Each image is in 1 × 28 × 28. FMNIST (Xiao et al., 2017): FMNIST is a dataset of Zalando's article images with 10 classes of clothes and shoes. ... KMNIST (Clanuwat et al., 2018): ... Omniglot (Lake et al., 2015): ... RGB image datasets: CIFAR-10/100 (Krizhevsky et al., 2009): ... SVHN (Netzer et al., 2011): ... LSUN (Yu et al., 2015): ... CelebA (Liu et al., 2015): ... |
| Dataset Splits | Yes | We use the official training and test split for all datasets, and we create the validation set by randomly holding out 10% from the training split. |
| Hardware Specification | No | The paper does not specify any particular GPU, CPU, or other hardware models used for the experiments. It only mentions general resources in the Acknowledgments: "Resources used in preparing this research were provided, in part, by the Province of Ontario, the Government of Canada through CIFAR, and companies sponsoring the Vector Institute." |
| Software Dependencies | No | The paper mentions "Our Pytorch implementation" but does not specify a version number for PyTorch or any other software libraries used. |
| Experiment Setup | Yes | Glow: ... The learning rate is set as 1e-5 for grayscale image datasets and 1e-3 for RGB image datasets. The optimizer is Adam with a weight decay of 1e-6. Real NVP: ... The learning rate is set as 1e-6 for grayscale image datasets and 1e-5 for RGB image datasets. The optimizer is Adam with a weight decay of 1e-6. |
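
The Dataset Splits entry reports that the validation set is a random 10% hold-out from the official training split. A minimal sketch of such a hold-out is given below, using only the Python standard library. The function name `train_val_split`, the seed, and the use of index lists are assumptions made for illustration; the paper does not say how the random hold-out was implemented or seeded.

```python
import random


def train_val_split(num_examples, val_fraction=0.1, seed=0):
    """Randomly hold out a fraction of training indices for validation.

    Hypothetical helper: the name, signature, and seed are illustrative
    choices, not taken from the paper.
    """
    indices = list(range(num_examples))
    # A dedicated Random instance keeps the shuffle reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    num_val = int(round(num_examples * val_fraction))
    val_indices = sorted(indices[:num_val])
    train_indices = sorted(indices[num_val:])
    return train_indices, val_indices


# MNIST's official training split has 60,000 images.
train_idx, val_idx = train_val_split(60000)
```

The returned index lists could then be used to subset whatever dataset object holds the official training split, leaving the official test split untouched.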
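
The Pseudocode entry names a group OOD detector built on a one-sample KS test (GOD1KS) without reproducing it. The sketch below illustrates the general idea only, under stated assumptions: the flow model's latent prior is a standard normal, the latent coordinates of all samples in a group are pooled into one 1-D sample, and the KS statistic against the standard normal CDF is used as the group score. The function names and the synthetic latents are hypothetical, and the paper's Algorithm 1 may differ in its details.

```python
import math
import random


def standard_normal_cdf(x):
    """CDF of N(0, 1), written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def ks_statistic_vs_standard_normal(values):
    """One-sample KS statistic: sup over x of |F_n(x) - Phi(x)|."""
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one value")
    d = 0.0
    for i, x in enumerate(xs):
        cdf = standard_normal_cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x,
        # so both sides of the jump are checked.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


def group_ood_score(latents):
    """Pool latent coordinates of a group and score them (assumed design).

    Larger scores mean the pooled latents look less like N(0, 1).
    """
    pooled = [z for vec in latents for z in vec]
    return ks_statistic_vs_standard_normal(pooled)


# Synthetic stand-ins for latents produced by a trained flow model.
rng = random.Random(0)
in_dist_group = [[rng.gauss(0.0, 1.0) for _ in range(64)] for _ in range(10)]
shifted_group = [[rng.gauss(0.5, 2.0) for _ in range(64)] for _ in range(10)]

in_score = group_ood_score(in_dist_group)
shifted_score = group_ood_score(shifted_group)
```

In practice the latents would come from passing a group of images through the trained flow, and a threshold on the score would presumably be chosen on held-out in-distribution groups.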