Hierarchical VAEs Know What They Don’t Know

Authors: Jakob D. Havtorn, Jes Frellsen, Søren Hauberg, Lars Maaløe

ICML 2021

Reproducibility Variable Result LLM Response
Research Type Experimental We benchmark the method on a vast set of data and model combinations and achieve state-of-the-art results on out-of-distribution detection.
Researcher Affiliation Collaboration 1Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark 2Corti AI, Copenhagen, Denmark.
Pseudocode No The paper does not contain any clearly labeled pseudocode or algorithm blocks. The methods are described using mathematical equations and descriptive text.
Open Source Code Yes Source code available at github.com/larsmaaloee/BIVA and github.com/vlievin/biva-pytorch. ... Source code available at github.com/jakobhavtorn/hvae-oodd
Open Datasets Yes We follow existing literature (Nalisnick et al., 2019a; Hendrycks et al., 2019) and evaluate our method by setting up OOD detection tasks from Fashion MNIST (Xiao et al., 2017) to MNIST (LeCun et al., 1998) and from CIFAR10 (Krizhevsky, 2009) to SVHN (Netzer et al., 2011).
Dataset Splits No The paper states 'We use the standard train/test splits for the datasets.' This names train and test splits but gives no validation split as a percentage or count. A validation set may have been used during model development, but it is not specified explicitly enough to reproduce.
Hardware Specification No The paper does not specify any particular hardware used for running the experiments (e.g., GPU models, CPU types, or memory). It only states that models are implemented in PyTorch.
Software Dependencies No The paper mentions 'We implement our models in PyTorch (Paszke et al., 2017)'. However, it does not provide version numbers for PyTorch or any other software library or dependency, which are needed for a reproducible description.
Experiment Setup Yes We use a Bernoulli output distribution for Fashion MNIST/MNIST and a discretized mixture of logistics output distribution (Salimans et al., 2017) for CIFAR10/SVHN. We use L = 3 for grey-scale images and L = 4 for natural images. For CIFAR/SVHN, we also train a BIVA model (Maaløe et al., 2019) with L = 10 and similar configuration as used by the original paper. All models are trained by optimizing the ELBO in (1).
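The setup above trains every model by maximizing the ELBO from the paper's Eq. (1): a reconstruction term minus a KL term summed over the L latent layers. As a minimal illustrative sketch (not the authors' implementation; all function names and the NumPy formulation are assumptions here), a single-sample ELBO estimate for an L-layer diagonal-Gaussian latent hierarchy with a Bernoulli output distribution could look like:

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """Closed-form KL(q || p) between diagonal Gaussians, summed over dims."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def bernoulli_log_prob(x, logits):
    """log p(x | logits) for Bernoulli pixels, summed over dims.

    Uses the numerically stable identity
    log sigmoid(l) = -log(1 + exp(-l)), via logaddexp.
    """
    return np.sum(x * logits - np.logaddexp(0.0, logits))

def elbo(x, recon_logits, q_params, p_params):
    """ELBO = log p(x | z) - sum_l KL(q(z_l | .) || p(z_l | .)).

    q_params / p_params: lists of (mu, logvar) pairs, one per latent
    layer (L = 3 for grey-scale images in the paper's setup).
    """
    kl = sum(
        gaussian_kl(mq, lq, mp, lp)
        for (mq, lq), (mp, lp) in zip(q_params, p_params)
    )
    return bernoulli_log_prob(x, recon_logits) - kl
```

For natural images the paper swaps the Bernoulli term for a discretized mixture of logistics (Salimans et al., 2017); the KL structure over the latent hierarchy is unchanged.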