Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models

Authors: Yuge Shi, Brooks Paige, Philip Torr, Siddharth N

ICLR 2021

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | "We show in experiments that our method enables data-efficient multimodal learning on challenging datasets for various multimodal variational autoencoder (VAE) models. We also show that under our proposed framework, the generative model can accurately identify related samples from unrelated ones, making it possible to make use of the plentiful unlabeled, unpaired multimodal data."

Researcher Affiliation | Academia | "1University of Oxford 2University College London 3University of Edinburgh 4The Alan Turing Institute"

Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks.

Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide a link to a code repository for the methodology described.

Open Datasets | Yes | MNIST-SVHN: "The dataset is designed to separate conceptual complexity, i.e. digit, from perceptual complexity, i.e. color, style, size. Each data pair contains 2 samples of the same digit, one from each dataset (see examples in Figure 3a)." CUB Image-Captions: "We also consider a more challenging language-vision multimodal dataset, Caltech-UCSD Birds (CUB) (Welinder et al., b; Reed et al., 2016)."

Dataset Splits | No | The paper discusses using different percentages of the original dataset (e.g., 20% of data used) for training and evaluation, but it does not specify explicit training, validation, and test splits (e.g., 70/15/15) for its datasets.

Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.

Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments.

Experiment Setup | Yes | "Throughout the experiments, we take N = 5 negative samples for the contrastive objective, set γ = 2 based on analyses of ablations in appendix E, and take K = 30 samples for our IWAE estimators."
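For readers unfamiliar with the K-sample IWAE estimator mentioned in the experiment setup, the sketch below shows the standard multi-sample bound log(1/K · Σ_k w_k), where w_k = p(x, z_k) / q(z_k | x) for samples z_k ~ q(z | x). This is a generic illustration of IWAE, not the paper's actual implementation; the function name and the toy Gaussian model used for the check are assumptions introduced here.

```python
import numpy as np

def iwae_estimate(log_weights):
    """K-sample IWAE bound, computed with the log-sum-exp trick for stability.

    log_weights: shape (K,), entries log p(x, z_k) - log q(z_k | x)
    for K samples z_k drawn from the proposal q(z | x).
    """
    K = log_weights.shape[0]
    m = log_weights.max()
    return m + np.log(np.exp(log_weights - m).sum()) - np.log(K)

# Toy check on a conjugate Gaussian model where log p(x) is known in closed
# form: p(z) = N(0, 1), p(x|z) = N(z, 1), so p(x) = N(0, 2) and the exact
# posterior is p(z|x) = N(x/2, 1/2). Using the exact posterior as the
# proposal makes every importance weight equal to p(x), so the K = 30
# sample bound recovers log p(x) exactly.
rng = np.random.default_rng(0)
x, K = 1.5, 30
z = rng.normal(loc=x / 2, scale=np.sqrt(0.5), size=K)          # z_k ~ q(z|x)
log_q = -0.5 * np.log(np.pi) - (z - x / 2) ** 2                # log N(z; x/2, 1/2)
log_p = (-0.5 * np.log(2 * np.pi) - z ** 2 / 2) \
      + (-0.5 * np.log(2 * np.pi) - (x - z) ** 2 / 2)          # log p(z) + log p(x|z)
bound = iwae_estimate(log_p - log_q)
true_log_px = -0.5 * np.log(4 * np.pi) - x ** 2 / 4            # log N(x; 0, 2)
```

With a weaker (non-posterior) proposal, the estimate is a lower bound on log p(x) that tightens as K grows, which is why papers often report the sample count used.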