Hierarchical VAEs Know What They Don’t Know

Authors: Jakob D. Havtorn, Jes Frellsen, Søren Hauberg, Lars Maaløe

ICML 2021

Reproducibility Variable Result LLM Response
Research Type Experimental We benchmark the method on a vast set of data and model combinations and achieve state-of-the-art results on out-of-distribution detection.
Researcher Affiliation Collaboration 1Department of Applied Mathematics and Computer Science, Technical University of Denmark, Kongens Lyngby, Denmark 2Corti AI, Copenhagen, Denmark.
Pseudocode No The paper does not contain any clearly labeled pseudocode or algorithm blocks. The methods are described using mathematical equations and descriptive text.
Open Source Code Yes Source code available at github.com/larsmaaloee/BIVA and github.com/vlievin/biva-pytorch. ... Source code available at github.com/jakobhavtorn/hvae-oodd
Open Datasets Yes We follow existing literature (Nalisnick et al., 2019a; Hendrycks et al., 2019) and evaluate our method by setting up OOD detection tasks from Fashion MNIST (Xiao et al., 2017) to MNIST (LeCun et al., 1998) and from CIFAR10 (Krizhevsky, 2009) to SVHN (Netzer et al., 2011).
Dataset Splits No The paper states 'We use the standard train/test splits for the datasets.' This names train and test splits but gives no validation split as a percentage or count. A validation set may have been used during model development, but it is not specified explicitly enough to reproduce.
Hardware Specification No The paper does not specify any particular hardware used for running the experiments (e.g., GPU models, CPU types, or memory). It only states that models are implemented in PyTorch.
Software Dependencies No The paper mentions 'We implement our models in PyTorch (Paszke et al., 2017)'. However, it does not provide version numbers for PyTorch or any other software library or dependency, which are needed for a reproducible description.
Experiment Setup Yes We use a Bernoulli output distribution for Fashion MNIST/MNIST and a discretized mixture of logistics output distribution (Salimans et al., 2017) for CIFAR10/SVHN. We use L = 3 for grey-scale images and L = 4 for natural images. For CIFAR/SVHN, we also train a BIVA model (Maaløe et al., 2019) with L = 10 and similar configuration as used by the original paper. All models are trained by optimizing the ELBO in (1).
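The setup above trains every model by maximizing the ELBO from the paper's Eq. (1): a reconstruction term minus a KL term summed over the L latent layers. As a minimal illustrative sketch (not the authors' implementation; all function names and the NumPy formulation are assumptions here), a single-sample ELBO estimate for an L-layer diagonal-Gaussian latent hierarchy with a Bernoulli output distribution could look like:

```python
import numpy as np

def gaussian_kl(mu_q, logvar_q, mu_p, logvar_p):
    """Closed-form KL(q || p) between diagonal Gaussians, summed over dims."""
    return 0.5 * np.sum(
        logvar_p - logvar_q
        + (np.exp(logvar_q) + (mu_q - mu_p) ** 2) / np.exp(logvar_p)
        - 1.0
    )

def bernoulli_log_prob(x, logits):
    """log p(x | logits) for Bernoulli pixels, summed over dims.

    Uses the numerically stable identity
    log sigmoid(l) = -log(1 + exp(-l)), via logaddexp.
    """
    return np.sum(x * logits - np.logaddexp(0.0, logits))

def elbo(x, recon_logits, q_params, p_params):
    """ELBO = log p(x | z) - sum_l KL(q(z_l | .) || p(z_l | .)).

    q_params / p_params: lists of (mu, logvar) pairs, one per latent
    layer (L = 3 for grey-scale images in the paper's setup).
    """
    kl = sum(
        gaussian_kl(mq, lq, mp, lp)
        for (mq, lq), (mp, lp) in zip(q_params, p_params)
    )
    return bernoulli_log_prob(x, recon_logits) - kl
```

For natural images the paper swaps the Bernoulli term for a discretized mixture of logistics (Salimans et al., 2017); the KL structure over the latent hierarchy is unchanged.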