Variational Mixture-of-Experts Autoencoders for Multi-Modal Deep Generative Models

Authors: Yuge Shi, N. Siddharth, Brooks Paige, Philip H.S. Torr

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate our model, we constructed two multi-modal scenarios to conduct experiments on. The first experiment involves many-to-many image ↔ image transforms on matching digits between the MNIST and street-view house numbers (SVHN) datasets. ... For each of these experiments, we provide both qualitative and quantitative analyses of the extent to which our model satisfies the four proposed criteria...
Researcher Affiliation | Academia | Yuge Shi and N. Siddharth, Department of Engineering Science, University of Oxford ({yshi, nsid}@robots.ox.ac.uk); Brooks Paige, Alan Turing Institute & University of Cambridge (bpaige@turing.ac.uk); Philip H.S. Torr, Department of Engineering Science, University of Oxford (philip.torr@eng.ox.ac.uk)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code, data, and models are provided at this url. ... Source code for all models and experiments is available at https://github.com/iffsid/mmvae.
Open Datasets | Yes | Code, data, and models are provided at this url. ... Data and pre-trained models from our experiments are also available at https://github.com/iffsid/mmvae. ... We employ the images and captions from Caltech-UCSD Birds (CUB) dataset (Wah et al., 2011)...
Dataset Splits | No | The paper does not provide specific details of the training, validation, and test splits (e.g., exact percentages or sample counts) needed for reproducibility, beyond mentioning a 'training set' and a 'test set' in general terms.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., GPU models, CPU models, or memory specifications).
Software Dependencies | No | The paper mentions software components such as the 'Adam optimiser' and 'AMSGrad', and models such as 'ResNet-101' and 'FastText', but does not specify version numbers for these or any other software dependencies.
Experiment Setup | Yes | For learning, we use the Adam optimiser (Kingma and Ba, 2014) with AMSGrad (Reddi et al., 2018), with a learning rate of 0.001. ... Here, we use CNNs for SVHN and MLPs for MNIST, with a 20d latent space. ... We use 128-dimensional latents with a Laplace likelihood on image features and a Categorical likelihood for captions.
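
The Experiment Setup row quotes the optimiser and likelihood choices but no code. Below is a minimal PyTorch sketch of how those quoted settings could be wired up; it is not the authors' released implementation (see https://github.com/iffsid/mmvae for that). The `TinyDecoder` class, tensor shapes, and toy vocabulary size are hypothetical placeholders; only the Adam + AMSGrad optimiser with learning rate 0.001, the 20-d (MNIST-SVHN) and 128-d (CUB) latent sizes, and the Laplace/Categorical likelihood choices come from the quoted text.

```python
import torch
from torch import nn, optim
from torch.distributions import Laplace, Categorical

class TinyDecoder(nn.Module):
    """Hypothetical stand-in for the paper's decoders (CNNs for SVHN, MLPs for MNIST)."""
    def __init__(self, latent_dim=20, out_dim=784):
        super().__init__()
        self.net = nn.Linear(latent_dim, out_dim)

    def forward(self, z):
        return self.net(z)

decoder = TinyDecoder(latent_dim=20)          # 20-d latent space, as quoted for MNIST-SVHN
optimizer = optim.Adam(decoder.parameters(),  # Adam with AMSGrad, learning rate 0.001
                       lr=1e-3, amsgrad=True)

# Likelihood choices quoted for the CUB experiment (which uses 128-d latents):
z = torch.zeros(1, 20)
img_loc = decoder(z)
img_lik = Laplace(img_loc, torch.ones_like(img_loc))  # Laplace likelihood on image features
cap_lik = Categorical(logits=torch.zeros(1, 10))      # Categorical likelihood over caption tokens (toy vocab)

# Placeholder reconstruction term, just to show the optimisation step.
loss = -img_lik.log_prob(torch.zeros_like(img_loc)).sum()
loss.backward()
optimizer.step()
```

This sketch only illustrates the hyperparameters quoted above; batch sizes, training epochs, and the full MMVAE objective are not specified in the excerpts and would need to be taken from the released code.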