Relating by Contrasting: A Data-efficient Framework for Multimodal Generative Models
Authors: Yuge Shi, Brooks Paige, Philip Torr, Siddharth N
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show in experiments that our method enables data-efficient multimodal learning on challenging datasets for various multimodal variational autoencoder (VAE) models. We also show that under our proposed framework, the generative model can accurately identify related samples from unrelated ones, making it possible to make use of the plentiful unlabeled, unpaired multimodal data. |
| Researcher Affiliation | Academia | ¹University of Oxford, ²University College London, ³University of Edinburgh, ⁴The Alan Turing Institute |
| Pseudocode | No | The paper does not contain any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code or provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | MNIST-SVHN The dataset is designed to separate conceptual complexity, i.e. digit, from perceptual complexity, i.e. color, style, size. Each data pair contains 2 samples of the same digit, one from each dataset (see examples in Figure 3a). ... CUB Image-Captions We also consider a more challenging language-vision multimodal dataset, Caltech-UCSD Birds (CUB) (Welinder et al.; Reed et al., 2016). |
| Dataset Splits | No | The paper discusses using different percentages of the original dataset (e.g., 20% of data used) for training and evaluation, but it does not specify explicit training, validation, and test splits (e.g., 70/15/15) for its datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | Throughout the experiments, we take N = 5 negative samples for the contrastive objective, set γ = 2 based on analyses of ablations in appendix E, and take K = 30 samples for our IWAE estimators. |
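
For orientation, the Experiment Setup row above fixes N = 5 negative samples for the contrastive objective, γ = 2, and K = 30 samples for the IWAE estimators. The sketch below is a minimal, hedged illustration of how such hyperparameters could plug into an InfoNCE-style contrastive term over related versus unrelated multimodal pairings; every name in it (`config`, `contrastive_term`, `log_joint_related`, ...) is our own assumption, and it does not reproduce the paper's actual objective.

```python
import torch
import torch.nn.functional as F

# Hypothetical configuration mirroring the reported hyperparameters;
# the variable names below are ours, not the paper's.
config = {
    "num_negatives": 5,  # N = 5 unrelated (negative) pairs per related pair
    "gamma": 2.0,        # gamma = 2, relative weight of the contrastive term
    "k_iwae": 30,        # K = 30 importance samples for the IWAE estimators
}

def contrastive_term(log_joint_related, log_joint_unrelated, gamma):
    """Sketch of an InfoNCE-style contrastive term over candidate pairings.

    log_joint_related:   shape (batch,), e.g. an IWAE estimate of log p(x1, x2)
                         for genuinely paired (related) modalities.
    log_joint_unrelated: shape (batch, N), the same estimate for N shuffled,
                         unrelated pairings (negatives).
    The paper's objective differs in detail; this only illustrates pushing
    related pairs above unrelated ones, scaled by gamma.
    """
    scores = torch.cat([log_joint_related.unsqueeze(1), log_joint_unrelated], dim=1)
    # Index 0 (the related pairing) is treated as the correct "class"
    # among 1 + N candidate pairings.
    targets = torch.zeros(scores.size(0), dtype=torch.long, device=scores.device)
    return gamma * F.cross_entropy(scores, targets)

# Usage with dummy scores: batch of 8, N = 5 negatives.
related = torch.randn(8)
unrelated = torch.randn(8, config["num_negatives"])
loss = contrastive_term(related, unrelated, config["gamma"])
```

The K = 30 setting enters only through how the log-joint estimates are computed (a larger K tightens the importance-weighted bound); the sketch above takes those estimates as given.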