Contrasting Multiple Representations with the Multi-Marginal Matching Gap
Authors: Zoe Piran, Michal Klein, James Thornton, Marco Cuturi
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate improved performance over multiview extensions of pairwise losses, for both self-supervised and multimodal tasks. |
| Researcher Affiliation | Collaboration | 1) Apple, 2) Hebrew University of Jerusalem. |
| Pseudocode | Yes | Algorithm 1: Multi-marginal Sinkhorn (MM-S). A hedged sketch of a generic multi-marginal Sinkhorn iteration is given after the table below. |
| Open Source Code | No | The paper states 'To perform experiments, we implemented the multi-marginal Sinkhorn algorithm (Alg. 1) in PyTorch (Paszke et al., 2019)' but does not provide an explicit statement or link for the release of their own M3G implementation code. The GitHub link provided is for a codebase they reused, not necessarily their own. |
| Open Datasets | Yes | We test the M3G loss in an SSL setting (ImageNet-1k) and two multimodal tasks (DomainNet and PhysioNet), citing (Deng et al., 2009), (Peng et al., 2019), and (Goldberger et al., 2000; Ghassemi et al., 2018; Kemp et al., 2000) respectively. |
| Dataset Splits | Yes | We consider a domain adaptation (DA) task, where the goal is to learn a common encoder, followed by one or multiple classifiers, using labeled data from multiple domains. We quantify the generalization power of this pre-trained encoder with a classification task, tested on data coming from a new, completely unseen domain. We pick one domain that acts as the unseen modality, and train representations on the k = 5 remaining domains. The train data contains segmented samples of 994 individuals, and the evaluation dataset, Sleep-EDFx (Goldberger et al., 2000; Kemp et al., 2000), contains 153 nights of sleep recordings from 78 individuals. |
| Hardware Specification | Yes | All results are given for the same per GPU batch size (n = 64), 300 epochs, ε = 0.2 for M3G, run on 4 nodes of 8 A100 GPUs. |
| Software Dependencies | No | The paper mentions implementing parts in PyTorch (Paszke et al., 2019) but does not provide specific version numbers for PyTorch or other software dependencies. |
| Experiment Setup | Yes | In Table A1 we provide the hyperparameters used to train ImageNet-1k and DomainNet models. ... Batch size 2048 (ImageNet-1k), 512 (DomainNet)... Training duration (epochs) 300... Optimizer AdamW... Base learning rate 6.5 × 10⁻⁴... Per-GPU batch size 64 (ImageNet-1k), 16 (DomainNet). |
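
The paper's Algorithm 1 (multi-marginal Sinkhorn, MM-S) is reported only as pseudocode and no official implementation is linked. Below is a minimal PyTorch sketch of a generic multi-marginal Sinkhorn iteration for uniform marginals, written from the standard entropic optimal-transport formulation rather than from any released code; the function name `multimarginal_sinkhorn` and its arguments are our own assumptions, not the authors' API.

```python
import math
import torch

def multimarginal_sinkhorn(cost: torch.Tensor, epsilon: float = 0.2, n_iters: int = 100):
    """Hedged sketch of multi-marginal Sinkhorn with uniform marginals.

    `cost` has one axis of size n per view (k axes total); at convergence the
    returned coupling tensor P satisfies all k uniform marginal constraints.
    """
    k, n = cost.dim(), cost.shape[0]
    log_a = -math.log(n)                      # log of the uniform weight 1/n
    potentials = [torch.zeros(n) for _ in range(k)]

    for _ in range(n_iters):
        for j in range(k):
            # Log-kernel with every dual potential except f_j folded in.
            log_kernel = -cost / epsilon
            for l in range(k):
                if l != j:
                    shape = [1] * k
                    shape[l] = n
                    log_kernel = log_kernel + (potentials[l] / epsilon + log_a).view(shape)
            other_axes = tuple(ax for ax in range(k) if ax != j)
            # Closed-form update enforcing the j-th marginal constraint.
            potentials[j] = -epsilon * torch.logsumexp(log_kernel, dim=other_axes)

    # Assemble the coupling from the converged potentials.
    log_P = -cost / epsilon
    for l in range(k):
        shape = [1] * k
        shape[l] = n
        log_P = log_P + (potentials[l] / epsilon + log_a).view(shape)
    return potentials, torch.exp(log_P)
```

Usage would look like `potentials, P = multimarginal_sinkhorn(cost, epsilon=0.2)` for a cost tensor of shape `(n,) * k` built from k embedding views. Note that this direct form materializes all n^k cost entries, so it is only tractable for small batch sizes and numbers of views; it is meant to illustrate the update rule, not to reproduce the paper's training-scale implementation.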