Generalized Multimodal ELBO
Authors: Thomas M. Sutter, Imant Daunhawer, Julia E. Vogt
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In extensive experiments, we demonstrate the advantage of the proposed method compared to state-of-the-art models in self-supervised, generative learning tasks. |
| Researcher Affiliation | Academia | Department of Computer Science ETH Zurich 8092 Zurich, Switzerland |
| Pseudocode | No | The paper does not contain explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The detailed architectures can also be looked up in the released code. |
| Open Datasets | Yes | We introduce a new dataset called PolyMNIST with 5 simplified modalities. Additionally, we evaluate all models on the trimodal matching digits dataset MNIST-SVHN-Text and the challenging bimodal CelebA dataset with images and text. The latter two were introduced in Sutter et al. (2020). |
| Dataset Splits | Yes | In total there are 60,000 tuples of training examples and 10,000 tuples of test examples and we make sure that no two MNIST digits were used in both the training and test set. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or cloud computing instance types used for experiments. |
| Software Dependencies | No | The paper mentions using "scikit-learn" and an "Adam optimizer (Kingma & Ba, 2014)" but does not provide specific version numbers for these software components or the deep learning framework used. |
| Experiment Setup | Yes | The latent space dimension is set to 20 for all modalities, models and runs. The results in tables 2 to 4 are generated with β = 5.0. We train all models for 150 epochs. ... We use an Adam optimizer ... with an initial learning rate of 0.001. |
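
The disjoint split described in the Dataset Splits row can be illustrated with a short sketch: examples from different modalities are matched by digit label, and tuples are formed only within the original MNIST train split or the original test split, so no MNIST digit image appears in both sets. The grouping-by-label helper below is an illustrative assumption, not the authors' released construction script.

```python
# Minimal sketch of building paired multimodal tuples with disjoint splits.
# The pairing-by-label scheme is an assumption for illustration only.
from collections import defaultdict

def pair_by_label(modality_a, modality_b):
    """modality_a / modality_b: lists of (example, label) from the SAME split."""
    by_label = defaultdict(list)
    for example, label in modality_b:
        by_label[label].append(example)
    tuples = []
    for example, label in modality_a:
        partners = by_label[label]
        if partners:
            # Cycle through matching-label partners so every example gets a pair.
            tuples.append((example, partners[len(tuples) % len(partners)], label))
    return tuples

# Pair only within matching splits, so no MNIST digit ends up in both sets:
# train_tuples = pair_by_label(mnist_train, svhn_train)   # on the order of 60,000 tuples
# test_tuples  = pair_by_label(mnist_test,  svhn_test)    # on the order of 10,000 tuples
```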
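The Experiment Setup row amounts to a compact training configuration: latent dimension 20, β = 5.0, 150 epochs, and Adam with an initial learning rate of 0.001. The sketch below wires these quoted values into a β-weighted ELBO training loop; the toy single-modality encoder/decoder and the random stand-in batch are assumptions for illustration, not the authors' released architecture.

```python
import torch
from torch import nn
from torch.optim import Adam

# Hyperparameters quoted from the paper; everything else is a stand-in.
LATENT_DIM = 20
BETA = 5.0
EPOCHS = 150
LR = 1e-3

class ToyVAE(nn.Module):
    """Minimal single-modality VAE used only to show the beta-weighted ELBO."""
    def __init__(self, in_dim=784, latent_dim=LATENT_DIM):
        super().__init__()
        self.enc = nn.Linear(in_dim, 2 * latent_dim)   # outputs mean and log-variance
        self.dec = nn.Linear(latent_dim, in_dim)

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization trick
        recon = self.dec(z)
        recon_loss = ((recon - x) ** 2).sum(dim=-1).mean()
        kl = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(dim=-1).mean()
        return recon_loss, kl

model = ToyVAE()
optimizer = Adam(model.parameters(), lr=LR)

for epoch in range(EPOCHS):
    x = torch.rand(64, 784)                 # stand-in batch; real runs use the datasets above
    optimizer.zero_grad()
    recon_loss, kl = model(x)
    loss = recon_loss + BETA * kl           # beta-weighted negative ELBO
    loss.backward()
    optimizer.step()
```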