MIWAE: Deep Generative Modelling and Imputation of Incomplete Data Sets

Authors: Pierre-Alexandre Mattei, Jes Frellsen

ICML 2019

Reproducibility assessment. Each item below gives the reproducibility variable, the assessed result, and the supporting LLM response:
Research Type: Experimental. LLM Response: We illustrate our approach by training a convolutional DLVM on incomplete static binarisations of MNIST. Moreover, on various continuous data sets, we show that MIWAE provides extremely accurate single imputations, and is highly competitive with state-of-the-art methods.
Researcher Affiliation: Academia. LLM Response: Department of Computer Science, IT University of Copenhagen, Denmark. Correspondence to: Pierre-Alexandre Mattei <pima@itu.dk>, Jes Frellsen <jefr@itu.dk>.
Pseudocode: No. LLM Response: The paper does not contain structured pseudocode or algorithm blocks; the methods are described in prose.
Open Source Code: No. LLM Response: The paper does not provide concrete access to source code for the methodology; there is no link or explicit statement of a code release.
Open Datasets: Yes. LLM Response: We illustrate the features of MIWAE by training a DLVM on an incomplete version of the static binarisation of MNIST. We consider a simple setting with 50% of the pixels missing uniformly at random (in a MCAR fashion). (Dua & Efi, 2017). URL http://archive.ics.uci.edu/ml.
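The 50%-MCAR corruption described in that response is simple to reproduce. The sketch below is only an illustration of the masking scheme, assuming NumPy; the function name `mcar_mask` and the placeholder data are hypothetical and not from the paper.

```python
import numpy as np

def mcar_mask(X, missing_rate=0.5, seed=0):
    """Hide each entry of X independently with probability `missing_rate`
    (missing completely at random, MCAR)."""
    rng = np.random.default_rng(seed)
    observed = rng.uniform(size=X.shape) > missing_rate  # True = entry is kept
    X_incomplete = np.where(observed, X, np.nan)          # NaN marks missing values
    return X_incomplete, observed

# Hypothetical usage on a binarised-MNIST-shaped array (n images x 784 pixels)
X = (np.random.rand(100, 784) > 0.5).astype(np.float32)
X_obs, mask = mcar_mask(X, missing_rate=0.5)
```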
Dataset Splits: Yes. LLM Response: To compare models, we evaluate estimates of their test log-likelihood obtained using importance sampling with 5000 samples and an inference network refitted on the test set, as suggested by Cremer et al. (2018) and Mattei & Frellsen (2018b).
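As a rough illustration of the evaluation protocol quoted above, the following sketch estimates a test log-likelihood by importance sampling with K samples from the inference network. The `encoder`, `decoder`, and `prior` interfaces (returning torch.distributions objects) and the function name are assumptions for this sketch; the paper itself provides no such code.

```python
import math
import torch

def is_log_likelihood(x, encoder, decoder, prior, K=5000):
    """Importance-sampling estimate of log p(x) with K samples:
        log p(x) ~= logsumexp_k[ log p(x|z_k) + log p(z_k) - log q(z_k|x) ] - log K,
    where z_k ~ q(z|x) is drawn from the (refitted) inference network."""
    q = encoder(x)                      # variational posterior q(z|x)
    z = q.rsample((K,))                 # K latent samples, shape (K, d)
    log_w = (decoder(z).log_prob(x)     # log p(x|z_k): observation model
             + prior.log_prob(z)        # log p(z_k): prior over latents
             - q.log_prob(z))           # log q(z_k|x): proposal density
    return torch.logsumexp(log_w, dim=0) - math.log(K)
```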
Hardware Specification: No. LLM Response: The paper does not provide specific hardware details (GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies: No. LLM Response: The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions such as Python 3.8 or CPLEX 12.4).
Experiment Setup: Yes. LLM Response: The intrinsic dimension d is fixed to 10, which may be larger than the actual number of features in the data, but DLVMs are known to automatically ignore some latent dimensions (Dai et al., 2018); both the encoder and decoder are multi-layer perceptrons with 3 hidden layers (128 hidden units each) and tanh activations; we use products of Student's t distributions for the variational family (following Domke & Sheldon, 2018) and the observation model (following Takahashi et al., 2018). We perform 500,000 gradient steps for all data sets; no regularisation scheme is used, but the observation model is constrained so that the eigenvalues of its covariances are larger than 0.01 (as suggested by Mattei & Frellsen, 2018a).
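The architectural details quoted above translate naturally into a small configuration sketch. The PyTorch snippet below only illustrates the stated sizes (d = 10, 3 hidden layers of 128 tanh units); the feature count `p`, the variable names, and the three-parameters-per-dimension Student's t parameterisation are assumptions for this sketch, not the authors' actual implementation.

```python
import torch.nn as nn

d = 10     # latent ("intrinsic") dimension, fixed to 10 in the setup
h = 128    # hidden units per layer
p = 784    # number of observed features (hypothetical; depends on the data set)

def mlp(in_dim, out_dim):
    """Multi-layer perceptron with 3 hidden layers of 128 tanh units."""
    return nn.Sequential(
        nn.Linear(in_dim, h), nn.Tanh(),
        nn.Linear(h, h), nn.Tanh(),
        nn.Linear(h, h), nn.Tanh(),
        nn.Linear(h, out_dim),
    )

# Encoder outputs parameters of a product of Student's t distributions over z
# (location, scale, degrees of freedom per dimension), hence 3 * d outputs;
# the decoder does the same for the Student's t observation model over x.
# The scale outputs would additionally be floored so that the covariance
# eigenvalues stay above 0.01, as stated in the setup.
encoder_net = mlp(p, 3 * d)
decoder_net = mlp(d, 3 * p)
```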