Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Optimal Transport for Domain Adaptation through Gaussian Mixture Models

Authors: Eduardo Fernandes Montesuma, Fred Maurice Ngolè Mboula, Antoine Souloumiac

TMLR 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We experiment with 9 benchmarks, with a total of 85 adaptation tasks, showing that our methods are more efficient than previous shallow domain adaptation methods, and they scale well with number of samples n and dimensions d."
Researcher Affiliation | Academia | Eduardo Fernandes Montesuma, Fred Ngolè Mboula, Antoine Souloumiac: Université Paris-Saclay, CEA, List, F-91120 Palaiseau, France
Pseudocode | Yes | Algorithm 1: Fitting procedure for GMMs. Algorithm 2: Fitting procedure for labeled GMMs. Algorithm 3: Pseudo-label target GMM. Algorithm 4: Tweight.
Open Source Code | Yes | "Our code is available at https://github.com/eddardd/gmm-otda"
Open Datasets | Yes | "We experiment with 9 benchmarks, with a total of 85 adaptation tasks... Caltech-Office (Gong et al., 2012), ImageCLEF (Caputo et al., 2014), Office31 (Saenko et al., 2010), Office-Home (Venkateswara et al., 2017), (MNIST, USPS, SVHN) (Seguy et al., 2017), and VisDA (Peng et al., 2017)... Furthermore, for completeness, we consider the Linearly Alignable Optimal Transport (LaOT) strategy of Struckmeier et al. (2023). The data for these benchmarks is publicly available here⁴ and here⁵."
Dataset Splits | Yes | "We run empirical OT methods on a sub-sample of n_S = n_T = 15,000 samples. Parametric versions of OT methods, such as OTDA-affine and GMM-OTDA, are run with the full datasets... Average classification accuracy with confidence intervals over a 5-fold cross-validation."
Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used in its experiments. It mentions ResNets and ViTs, which are models rather than hardware, and discusses pre-training and feature extraction, but gives no hardware specifics.
Software Dependencies | No | The paper mentions machine learning concepts and frameworks (e.g., ResNet, the Sinkhorn algorithm) but does not provide version numbers for any software dependencies such as Python, PyTorch, TensorFlow, or CUDA.
Experiment Setup | No | The paper describes aspects of the experimental setup, such as pre-training ResNets and the parameters of the GMM-OT method (K, ϵ, τ), but it does not provide concrete hyperparameter values for neural network training (e.g., learning rate, batch size, or number of epochs for the feature extractor) or detailed system-level training configurations.
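Context for the pseudocode row above: optimal transport between two Gaussian mixtures is commonly reduced to a discrete OT problem over mixture components, with the ground cost given by the squared 2-Wasserstein (Bures) distance between Gaussian components. The sketch below is not the authors' implementation (see their repository for that); it is a minimal NumPy illustration of that ground cost between two Gaussians, with the helper name `gaussian_w2_sq` chosen here for illustration.

```python
import numpy as np

def gaussian_w2_sq(m1, C1, m2, C2):
    """Squared 2-Wasserstein distance between N(m1, C1) and N(m2, C2).

    W2^2 = ||m1 - m2||^2 + tr(C1 + C2 - 2 (C2^{1/2} C1 C2^{1/2})^{1/2})
    """
    def sqrtm(C):
        # Symmetric PSD matrix square root via eigendecomposition.
        w, V = np.linalg.eigh(C)
        return (V * np.sqrt(np.clip(w, 0.0, None))) @ V.T

    s2 = sqrtm(C2)
    cross = sqrtm(s2 @ C1 @ s2)
    bures = np.trace(C1 + C2 - 2.0 * cross)
    return float(np.sum((m1 - m2) ** 2) + bures)

# Sanity checks: identical Gaussians are at distance zero; a pure
# translation contributes only the squared mean difference.
m, C = np.zeros(2), np.eye(2)
d0 = gaussian_w2_sq(m, C, m, C)        # 0.0
d1 = gaussian_w2_sq(m, C, m + 3.0, C)  # 3^2 + 3^2 = 18.0
```

Filling a K_src-by-K_tgt matrix of such costs and solving a discrete OT problem over the mixture weights is what makes GMM-based OT scale with the number of components rather than the number of samples.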