Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

An Optimal Transport-based Latent Mixer for Robust Multi-modal Learning

Authors: Fengjiao Gong, Angxiao Yue, Hongteng Xu

AAAI 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on multi-modal clustering and classification demonstrate that the models learned with the OTM method outperform the corresponding baselines. Experiments on multi-modal clustering and classification demonstrate the effectiveness of our method compared with the existing baselines, especially in the unaligned multi-modal scenarios.
Researcher Affiliation | Academia | (1) Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China; (2) Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China
Pseudocode | Yes | Algorithm 1: Computation of FGW distance; Algorithm 2: Computation of FGW barycenter
Open Source Code | Yes | The code and more experimental results can be found at https://github.com/redLinmumu/OTM.
Open Datasets | Yes | For clustering tasks, we conduct the experiments on four conventional multi-modal datasets used in (Hu, Nie, and Li 2019; Guo et al. 2014; Gong, Nie, and Xu 2022). Each dataset contains well-aligned samples and corresponding labels, which are used only in the validation stage. The datasets for the classification and regression tasks are chosen from MultiBench (Liang et al. 2021), which is a well-known systematic large-scale multi-modal learning benchmark.
Dataset Splits | Yes | Each model is trained by five-fold cross-validation. For the clustering models, we apply the clustering purity to evaluate their performance. For the classification and regression models, we apply the classification accuracy and MAE (Willmott and Matsuura 2005) to evaluate them, respectively.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | No | The paper describes the model architecture (two-layer multi-layer perceptrons) and mentions using K-means for clustering and five-fold cross-validation. However, it does not provide specific hyperparameters such as learning rate, batch size, number of epochs, or optimizer settings in the main text.
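For readers unfamiliar with the FGW distance named in the Pseudocode row, the objective those algorithms minimize can be sketched numerically. The snippet below only evaluates the fused Gromov-Wasserstein cost of a fixed coupling matrix; it is not the paper's solver, and the function name, the test matrices, and the uniform coupling are illustrative assumptions:

```python
import numpy as np

def fgw_cost(M, C1, C2, T, alpha=0.5):
    """Fused Gromov-Wasserstein objective for a fixed coupling T.

    M  : (n, m) cross-domain feature-distance matrix
    C1 : (n, n) intra-domain structure matrix of the source
    C2 : (m, m) intra-domain structure matrix of the target
    T  : (n, m) transport plan; alpha trades off the feature
         (Wasserstein) term against the structure (GW) term.
    """
    # Linear term: expected feature cost under the coupling
    linear = np.sum(M * T)
    # Quadratic term: expected squared structure distortion,
    # sum over i,j,k,l of (C1[i,k] - C2[j,l])^2 * T[i,j] * T[k,l]
    D = (C1[:, None, :, None] - C2[None, :, None, :]) ** 2  # (n, m, n, m)
    quadratic = np.einsum('ijkl,ij,kl->', D, T, T)
    return (1 - alpha) * linear + alpha * quadratic

# Tiny illustration with uniform marginals and the independent coupling
rng = np.random.default_rng(0)
n, m = 4, 3
M = rng.random((n, m))
C1, C2 = rng.random((n, n)), rng.random((m, m))
T = np.full((n, m), 1.0 / (n * m))  # outer product of uniform marginals
print(fgw_cost(M, C1, C2, T, alpha=0.5))
```

At `alpha=0` the cost reduces to a plain Wasserstein-style feature cost, and at `alpha=1` to a pure Gromov-Wasserstein structure cost, which is the trade-off the fused distance is designed to expose.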
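The evaluation metrics quoted in the Dataset Splits row are standard but worth pinning down. A minimal sketch of clustering purity and MAE follows; the helper names are mine, and only the metric definitions come from common usage (MAE as in Willmott and Matsuura 2005):

```python
import numpy as np

def clustering_purity(y_true, y_pred):
    # For each predicted cluster, credit the size of its majority true class
    correct = 0
    for cluster in np.unique(y_pred):
        members = y_true[y_pred == cluster]
        correct += np.bincount(members).max()
    return correct / len(y_true)

def mean_absolute_error(y_true, y_pred):
    # Mean absolute error between predictions and targets
    return np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred)))

# Purity is invariant to a relabeling of the clusters:
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([2, 2, 0, 0, 1, 1])  # same partition, permuted labels
print(clustering_purity(y_true, y_pred))
print(mean_absolute_error([1.0, 2.0], [1.5, 2.5]))
```

Purity rewards any partition that matches the ground-truth grouping regardless of label names, which is why it suits unsupervised clustering evaluation where labels are used only at validation time.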
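The Experiment Setup row notes that the paper applies K-means to the learned representations. As a hedged sketch of that step only (Lloyd's algorithm with a deterministic farthest-point initialization, applied here to raw synthetic points rather than the paper's MLP latent codes):

```python
import numpy as np

def kmeans(X, k, iters=50):
    # Deterministic farthest-point initialization
    centers = [X[0]]
    for _ in range(1, k):
        dists = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[dists.argmax()])
    centers = np.array(centers)
    # Lloyd iterations: assign to nearest center, then recompute means
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep old center if a cluster empties
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# Two well-separated blobs should be recovered as two clusters
X = np.vstack([np.random.default_rng(1).normal(0, 0.1, (20, 2)),
               np.random.default_rng(2).normal(5, 0.1, (20, 2))])
labels, centers = kmeans(X, k=2)
```

In the paper's pipeline this clustering runs on the fused latent codes; the paper does not state its K-means initialization or iteration count, so those choices here are assumptions.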