Contrasting with Symile: Simple Model-Agnostic Representation Learning for Unlimited Modalities

Authors: Adriel Saporta, Aahlad Manas Puli, Mark Goldstein, Rajesh Ranganath

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that Symile outperforms pairwise CLIP on cross-modal classification and retrieval across several experiments, including on a multilingual dataset of images, text and audio of over 33M examples and a clinical dataset of chest X-rays, electrocardiograms, and laboratory measurements. We show that Symile retains its advantage over pairwise CLIP even with modalities missing in the data. We publicly release both the multilingual and the clinical datasets, which are specifically designed to test a model's ability to capture higher-order information between three distinct high-dimensional data types.
Researcher Affiliation | Academia | Adriel Saporta, Aahlad Puli, Mark Goldstein, Rajesh Ranganath (New York University)
Pseudocode | Yes | Algorithm 1: Pseudocode for implementation of Symile with O(N) negative sampling (an illustrative loss sketch follows this table).
Open Source Code | Yes | All datasets and code used in this work are publicly available at https://github.com/rajesh-lab/symile.
Open Datasets | Yes | All datasets and code used in this work are publicly available at https://github.com/rajesh-lab/symile.
Dataset Splits | Yes | For each of the three datasets, 10M training, 500K validation, and 500K test samples were generated. ... We split our dataset (11,622 admissions) into a train/validation development set (95% of patients) and a test set (5% of patients), ensuring there is no patient overlap across the splits. (A patient-level split sketch follows this table.)
Hardware Specification | Yes | Experiments were conducted with 16 CPUs, 200GB of RAM, and a single NVIDIA A100 80GB PCIe GPU.
Software Dependencies | No | The paper references specific software such as the AdamW optimizer [32], Whisper [41] (Hugging Face model id openai/whisper-large-v3), CLIP [40] (Hugging Face model id openai/clip-vit-large-patch14), and XLM-RoBERTa [13] (Hugging Face model id xlm-roberta-large), but explicit Python library versions (e.g., PyTorch 1.9) are not provided, making it difficult to fully reproduce the software environment. (A sketch of loading the referenced checkpoints follows this table.)
Experiment Setup | Yes | For all experiments, we use the AdamW optimizer [32]. Following [40], the temperature parameter τ is directly optimized during training as a multiplicative scalar to avoid the need for separate hyperparameter tuning. ... Both Symile and CLIP are trained for 100 epochs using a batch size of 1000, a learning rate of 0.1, and a weight decay of 0.01. The learned temperature parameter τ is initialized to 0.3. The Symile loss is trained with O(N) negative sampling. (A training-setup sketch follows this table.)
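
The Symile loss scores each triple with the multilinear inner product (the sum over the feature dimension of the element-wise product of the three representations) and contrasts aligned triples against in-batch negatives. Below is a minimal, illustrative sketch of such a loss with O(N) candidates per anchor; the name symile_style_loss and the particular way negatives are formed (reusing the other two modalities at a shared index) are assumptions for illustration and may differ from the paper's Algorithm 1.

```python
import torch
import torch.nn.functional as F

def symile_style_loss(za, zb, zc, logit_scale):
    """Illustrative Symile-style contrastive loss with O(N) candidates per anchor.

    za, zb, zc: (N, d) batches of representations for the three modalities.
    logit_scale: positive scalar (e.g. 1 / tau) multiplying the MIP scores.

    For each anchor modality, row i's positive is the aligned triple (i, i, i);
    the remaining candidates reuse the other two modalities at index j, giving
    N candidates per anchor instead of N^2. This is one plausible O(N)
    construction, not necessarily the authors' exact negative-sampling scheme.
    """
    N = za.shape[0]
    labels = torch.arange(N, device=za.device)

    # logits_a[i, j] = MIP(za_i, zb_j, zc_j) = sum_d za[i,d] * zb[j,d] * zc[j,d]
    logits_a = logit_scale * torch.einsum("id,jd,jd->ij", za, zb, zc)
    logits_b = logit_scale * torch.einsum("id,jd,jd->ij", zb, za, zc)
    logits_c = logit_scale * torch.einsum("id,jd,jd->ij", zc, za, zb)

    # Average the cross-entropy over the three anchor modalities.
    return (F.cross_entropy(logits_a, labels)
            + F.cross_entropy(logits_b, labels)
            + F.cross_entropy(logits_c, labels)) / 3.0
```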
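
The clinical split is performed at the patient level so that no patient appears in both the development and test sets. A minimal sketch of one way to perform such a grouped split, using scikit-learn's GroupShuffleSplit on hypothetical admission and patient-ID arrays (not the paper's actual code):

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical placeholders: one row per admission, each tagged with a patient ID.
admissions = np.arange(11622)
patient_ids = np.random.randint(0, 9000, size=11622)

# Grouping by patient ID guarantees no patient overlap across the
# 95% development / 5% test splits described in the quoted protocol.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.05, random_state=0)
dev_idx, test_idx = next(splitter.split(admissions, groups=patient_ids))
```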
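
The checkpoints named in the dependency review can be loaded through Hugging Face transformers. The specific model classes chosen below, and any pooling or projection layers on top of them, are assumptions; the paper does not pin exact library versions.

```python
from transformers import WhisperModel, CLIPVisionModel, XLMRobertaModel

# Pretrained encoders referenced in the paper, loaded from the Hugging Face Hub.
audio_encoder = WhisperModel.from_pretrained("openai/whisper-large-v3").encoder   # audio
image_encoder = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")  # images
text_encoder = XLMRobertaModel.from_pretrained("xlm-roberta-large")               # multilingual text
```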
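
Finally, a self-contained sketch of the quoted training configuration (AdamW, learning rate 0.1, weight decay 0.01, batch size 1000, and a directly learned temperature initialized to 0.3), reusing symile_style_loss from the sketch above. The linear projection heads and random features are hypothetical stand-ins for the paper's encoders, and learning log(1/τ) is just one way to realize a learned multiplicative temperature.

```python
import torch

# Hypothetical projection heads standing in for the paper's encoders.
d = 128
proj_a, proj_b, proj_c = (torch.nn.Linear(512, d) for _ in range(3))

# Learn log(1/tau) so the multiplicative scale stays positive; tau starts at 0.3.
log_inv_tau = torch.nn.Parameter(torch.log(torch.tensor(1.0 / 0.3)))

params = [log_inv_tau] + [p for m in (proj_a, proj_b, proj_c) for p in m.parameters()]
optimizer = torch.optim.AdamW(params, lr=0.1, weight_decay=0.01)

# One illustrative optimization step on random features (batch size 1000, as quoted).
xa, xb, xc = (torch.randn(1000, 512) for _ in range(3))
loss = symile_style_loss(proj_a(xa), proj_b(xb), proj_c(xc), log_inv_tau.exp())
optimizer.zero_grad()
loss.backward()
optimizer.step()
```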