Identifiability Results for Multimodal Contrastive Learning

Authors: Imant Daunhawer, Alice Bizeul, Emanuele Palumbo, Alexander Marx, Julia E. Vogt

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically verify our identifiability results with numerical simulations and corroborate our findings on a complex multimodal dataset of image/text pairs. The goal of our experiments is to test whether contrastive learning can block-identify content in the multimodal setting, as described by Theorem 1." (A minimal sketch of such a contrastive objective follows the table.)
Researcher Affiliation | Academia | Imant Daunhawer (1), Alice Bizeul (1,2), Emanuele Palumbo (1,2), Alexander Marx (1,2), & Julia E. Vogt (1); (1) Department of Computer Science, ETH Zurich; (2) ETH AI Center, ETH Zurich
Pseudocode | No | The paper describes theoretical models and experimental setups but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The code is provided in our github repository."
Open Datasets | No | The paper describes extending the Causal3DIdent and CLEVR datasets to create Multimodal3DIdent, and provides code to generate it. However, it does not provide a direct link, DOI, or explicit statement that the specific generated instance of the Multimodal3DIdent dataset used in the experiments is publicly available for download.
Dataset Splits | Yes | Table 2b: # Samples (train / val / test): 125,000 / 10,000 / 10,000
Hardware Specification | No | The paper states that "Experiments were performed on the ETH Zurich Leonhard cluster" but does not provide specific details such as GPU/CPU models, processor types, or memory specifications.
Software Dependencies | No | The paper mentions software such as Blender and the Adam optimizer, and uses architectures such as ResNet-18, but it does not specify version numbers for any software dependencies.
Experiment Setup | Yes | Table 2a: Parameters used for the numerical simulation. Table 2b: Parameters used for Multimodal3DIdent.
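
For context on the Research Type and Experiment Setup entries: the paper's experiments train modality-specific encoders (e.g., a ResNet-18 for images) with a contrastive objective over paired samples, optimized with Adam. The snippet below is a minimal sketch of a symmetric InfoNCE loss of the kind used in such multimodal contrastive learning; it is not the authors' code, and the function name, temperature value, and cosine-similarity choice are illustrative assumptions rather than details taken from the paper (the actual hyperparameters are listed in its Tables 2a/2b).

```python
# Minimal sketch (not the authors' code): symmetric InfoNCE loss for
# paired image/text embeddings, as used in multimodal contrastive learning.
import torch
import torch.nn.functional as F

def symmetric_info_nce(z_img, z_txt, temperature=0.1):
    """z_img, z_txt: (batch, dim) embeddings; row i of each tensor forms a
    positive pair, and every other row in the batch serves as a negative.
    The temperature is an illustrative placeholder, not a value from the paper."""
    z_img = F.normalize(z_img, dim=1)
    z_txt = F.normalize(z_txt, dim=1)
    logits = z_img @ z_txt.t() / temperature   # (batch, batch) cosine similarities
    targets = torch.arange(z_img.size(0), device=z_img.device)
    # Contrast in both directions (image->text and text->image) and average.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```

In the paper's setting, minimizing an objective of this kind is what Theorem 1 analyzes: the learned representation should block-identify the content shared across the two modalities, which the experiments then check on the numerical simulations and on Multimodal3DIdent.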