Subspace Detours: Building Transport Plans that are Optimal on Subspace Projections

Authors: Boris Muzellec, Marco Cuturi

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We consider applications to semantic mediation between elliptic word embeddings and domain adaptation with Gaussian mixture models. In Section 6 we showcase the behavior of MK and MI transports on (noisy) synthetic data, show how using a mediating subspace can be applied to selecting meanings for polysemous elliptical word embeddings, and experiment using MK maps with the minimizing algorithm on a domain adaptation task with Gaussian mixture models."
Researcher Affiliation | Collaboration | Boris Muzellec (CREST, ENSAE, boris.muzellec@ensae.fr) and Marco Cuturi (Google Brain and CREST, ENSAE, cuturi@google.com)
Pseudocode | Yes | Algorithm 1: MK Subspace Selection
Open Source Code | No | The paper does not provide a statement or link to open-source code for the described methodology.
Open Datasets | Yes | "We use the Office Home dataset [31], which comprises 15,000 images from 65 different classes across 4 domains: Art, Clipart, Product and Real World."
Dataset Splits | No | The paper does not explicitly provide training/validation/test splits, percentages, or absolute sample counts for data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers.
Experiment Setup | Yes | "First, a k-means quantization of both images is computed. Then, the colors of the pixels within each source cluster are modified according to the optimal transport map between both color distributions. We compare this approach with classic full OT maps and a sliced OT approach (with 100 random projections)." "We represent the source as a GMM by fitting one Gaussian per source class and defining mixture weights proportional to class frequencies, and we fit a GMM with the same number of components on the target. We use Algorithm 1 between the empirical covariance matrices of the source and target datasets to select the supporting subspace E, for different values of the supporting dimension k (Figure 7)."
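The color-transfer setup quoted in the Experiment Setup row (k-means quantization of both pixel clouds, then an OT map between the two cluster color distributions) can be sketched as follows. This is a minimal illustrative reconstruction, not the authors' code: the function name `ot_color_transfer`, the use of an entropic-regularized Sinkhorn solver in place of an exact OT solver, and all parameter values (`k`, `eps`, iteration counts) are assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def sinkhorn(a, b, C, eps=0.1, n_iter=200):
    """Entropic-regularized OT plan between histograms a, b with cost C."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]  # rows sum to a, columns approx. b

def ot_color_transfer(src, tgt, k=8, seed=0):
    """src, tgt: (n_pixels, 3) color arrays. Returns recolored src pixels."""
    # Step 1: k-means quantization of both images' color distributions.
    km_s = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(src)
    km_t = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(tgt)
    # Cluster weights proportional to cluster sizes.
    a = np.bincount(km_s.labels_, minlength=k) / len(src)
    b = np.bincount(km_t.labels_, minlength=k) / len(tgt)
    # Squared-Euclidean cost between source and target centroids.
    C = ((km_s.cluster_centers_[:, None] - km_t.cluster_centers_[None]) ** 2).sum(-1)
    # Step 2: OT plan between the two quantized color distributions.
    P = sinkhorn(a, b, C)
    # Barycentric map: each source centroid -> weighted mean of target centroids.
    T = (P / P.sum(axis=1, keepdims=True)) @ km_t.cluster_centers_
    # Shift every pixel by its cluster's displacement.
    return src + (T - km_s.cluster_centers_)[km_s.labels_]
```

A sliced-OT baseline (as the paper compares against, with 100 random projections) would instead average 1D transport maps over random directions; the quantized version above keeps the cost matrix small (k × k) regardless of image size.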