Geometric Dataset Distances via Optimal Transport

Authors: David Alvarez-Melis, Nicolò Fusi

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our results show that this novel distance provides meaningful comparison of datasets, and correlates well with transfer learning hardness across various experimental settings and datasets. We provide extensive empirical evidence that this distance is highly predictive of transfer learning success across various domains, tasks and data modalities."
Researcher Affiliation | Industry | David Alvarez-Melis, Microsoft Research, New England (alvarez.melis@microsoft.com); Nicolò Fusi, Microsoft Research, New England (nfusi@microsoft.com)
Pseudocode | No | The paper describes the computational steps and theoretical foundations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of its source code.
Open Datasets | Yes | "We start with a simple domain adaptation setting, using USPS, MNIST [36] and three of its extensions: Fashion-MNIST [54], KMNIST [11] and the letters split of EMNIST [12]."
Dataset Splits | No | The paper mentions using specific datasets (e.g., MNIST, USPS, CIFAR-10) but does not provide explicit details on how the data was split into training, validation, and test sets (e.g., exact percentages or sample counts for each split).
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., CPU/GPU models, memory specifications, or cloud computing instances).
Software Dependencies | No | The paper mentions using the 'torchtext library' and the 'BERT model', but does not provide specific version numbers for these or any other software dependencies crucial for reproducibility.
Experiment Setup | Yes | "Training details can be found in Appendix E. For the *NIST experiments (Section 6.2), we use the Adam optimizer with an initial learning rate of 10e-4 and a batch size of 128."
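
For readers who want a concrete sense of the dataset distance the paper studies: since the "Open Source Code" row notes that no implementation is released with the paper, the following is a minimal sketch, assuming the POT library (ot.dist, ot.emd2), of a plain optimal-transport distance between the feature clouds of two datasets. It deliberately omits the label-aware ground metric that distinguishes the paper's OTDD, so it illustrates the overall recipe rather than reproducing the authors' method.

```python
# Minimal sketch (not the authors' code): a label-agnostic OT distance between
# the feature clouds of two datasets, using the POT library. The paper's OTDD
# additionally folds a label-to-label Wasserstein distance into the ground
# cost; that step is omitted here.
import numpy as np
import ot  # POT: Python Optimal Transport


def feature_ot_distance(x_src, x_tgt, n_samples=1000, seed=0):
    """Exact OT cost (squared-Euclidean ground metric) between two samples."""
    rng = np.random.default_rng(seed)
    xs = x_src[rng.choice(len(x_src), min(n_samples, len(x_src)), replace=False)]
    xt = x_tgt[rng.choice(len(x_tgt), min(n_samples, len(x_tgt)), replace=False)]

    xs = xs.reshape(len(xs), -1).astype(np.float64)
    xt = xt.reshape(len(xt), -1).astype(np.float64)

    a = np.full(len(xs), 1.0 / len(xs))  # uniform weights on source points
    b = np.full(len(xt), 1.0 / len(xt))  # uniform weights on target points

    M = ot.dist(xs, xt)      # pairwise squared-Euclidean cost matrix
    return ot.emd2(a, b, M)  # exact OT cost via the network simplex solver
```

With, for example, MNIST and USPS images flattened into vectors, one would expect larger values of this simplified quantity for more dissimilar dataset pairs, loosely mirroring the comparisons in the paper's Section 6.2.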
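The "Open Datasets" row quotes the *NIST datasets used for the domain-adaptation experiments, but the paper does not say how they were obtained. The sketch below assumes torchvision's built-in dataset classes, which expose all five; the download location and preprocessing are assumptions, not the authors' setup.

```python
# Sketch of obtaining the datasets named in the "Open Datasets" row via
# torchvision. The dataset classes and transforms used here are assumptions.
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
root = "./data"  # hypothetical download location

nist_datasets = {
    "MNIST": datasets.MNIST(root, train=True, download=True, transform=to_tensor),
    "USPS": datasets.USPS(root, train=True, download=True, transform=to_tensor),
    "FashionMNIST": datasets.FashionMNIST(root, train=True, download=True, transform=to_tensor),
    "KMNIST": datasets.KMNIST(root, train=True, download=True, transform=to_tensor),
    # "letters" is the EMNIST split mentioned in the quoted sentence.
    "EMNIST-letters": datasets.EMNIST(root, split="letters", train=True, download=True, transform=to_tensor),
}
```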
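The "Experiment Setup" row fixes only the optimizer, the initial learning rate, and the batch size; the rest of the training details live in the paper's Appendix E. A hedged PyTorch sketch of that configuration follows: the model architecture is a placeholder, and the quoted "10e-4" is read as 10^-4, both of which are assumptions.

```python
# Sketch of the training configuration quoted in the "Experiment Setup" row.
# Only the optimizer, learning rate, and batch size come from the paper.
import torch
from torch import nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Placeholder classifier: the quote does not specify an architecture.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.ReLU(), nn.Linear(256, 10))

# "initial learning rate of 10e-4" is read here as 10^-4; adjust if 1e-3 was meant.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

train_set = datasets.MNIST("./data", train=True, download=True, transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=128, shuffle=True)

for images, labels in loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    break  # one illustrative step; the full schedule is described in Appendix E
```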