Training Transitive and Commutative Multimodal Transformers with LoReTTa

Authors: Manuel Tran, Yashin Dicente Cid, Amal Lahiani, Fabian Theis, Tingying Peng, Eldad Klaiman

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We extensively evaluate our approach on a synthetic, medical, and reinforcement learning dataset." |
| Researcher Affiliation | Collaboration | 1. Roche Diagnostics GmbH; 2. Roche Diagnostics S.L.; 3. Technical University of Munich; 4. Helmholtz Munich |
| Pseudocode | Yes | "We also publish the pseudocode and data processing pipeline." |
| Open Source Code | No | The paper mentions publishing "pseudocode and data processing pipeline" but gives no concrete access to the implementation, e.g., a repository link or an explicit statement of code release. |
| Open Datasets | Yes | "The speech dataset features about 40,000 spectrograms from AudioMNIST [31], the vision dataset comprises 70,000 images from MNIST [34], and the language dataset consists of 130,000 documents from Wine Reviews [60]." |
| Dataset Splits | No | The paper describes how datasets were constructed for specific scenarios (e.g., non-overlapping samples for the bimodal datasets, or subsets simulating missing modalities) but provides no explicit train/validation/test percentages or counts, nor a general splitting methodology for reproducibility. |
| Hardware Specification | Yes | "We trained all of our models on a single NVIDIA A100-SXM4-40GB GPU using PyTorch 2.0." |
| Software Dependencies | Yes | "We trained all of our models on a single NVIDIA A100-SXM4-40GB GPU using PyTorch 2.0." |
| Experiment Setup | Yes | "For optimization, we choose the AdamW algorithm with a learning rate of 6e-4, a weight decay factor of 0.1, and a gradient clipping of 1. The learning rate undergoes a 10-fold decay using cosine annealing and a linear warm-up during the first couple hundred steps." |
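The learning-rate schedule quoted in the last row (linear warm-up, then cosine annealing with a 10-fold decay from the 6e-4 peak) can be sketched as a stand-alone function. This is a minimal sketch, not the authors' code: the warm-up length (200 steps) and the total step count are assumptions, since the paper only says "the first couple hundred steps" and does not state the training length.

```python
import math

PEAK_LR = 6e-4           # peak learning rate, from the paper
MIN_LR = PEAK_LR / 10    # "10-fold decay" via cosine annealing
WARMUP_STEPS = 200       # assumption: "first couple hundred steps"
TOTAL_STEPS = 10_000     # assumption: not stated in the paper

def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Linear warm-up from ~0 to the peak learning rate.
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Cosine annealing from PEAK_LR down to MIN_LR over the remaining steps.
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * progress))
```

In a PyTorch 2.0 training loop this factor would typically be applied per step via `torch.optim.lr_scheduler.LambdaLR` on top of `torch.optim.AdamW(params, lr=6e-4, weight_decay=0.1)`, with `torch.nn.utils.clip_grad_norm_(params, 1.0)` for the gradient clipping of 1.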