Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Geometric Dataset Distances via Optimal Transport
Authors: David Alvarez-Melis, Nicolò Fusi
NeurIPS 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our results show that this novel distance provides meaningful comparison of datasets, and correlates well with transfer learning hardness across various experimental settings and datasets. We provide extensive empirical evidence that this distance is highly predictive of transfer learning success across various domains, tasks and data modalities. |
| Researcher Affiliation | Industry | David Alvarez-Melis, Microsoft Research, New England, EMAIL; Nicolò Fusi, Microsoft Research, New England, EMAIL |
| Pseudocode | No | The paper describes the computational steps and theoretical foundations but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of its source code. |
| Open Datasets | Yes | We start with a simple domain adaptation setting, using USPS, MNIST [36] and three of its extensions: Fashion-MNIST [54], KMNIST [11] and the letters split of EMNIST [12]. |
| Dataset Splits | No | The paper mentions using specific datasets (e.g., MNIST, USPS, CIFAR-10) but does not provide explicit details on how the data was split into training, validation, and test sets (e.g., exact percentages or sample counts for each split). |
| Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments (e.g., CPU/GPU models, memory specifications, or cloud computing instances). |
| Software Dependencies | No | The paper mentions using the 'torchtext library' and the 'BERT model', but does not provide specific version numbers for these or any other software dependencies crucial for reproducibility. |
| Experiment Setup | Yes | Training details can be found in Appendix E. For the *NIST experiments (Section 6.2), we use the Adam optimizer with an initial learning rate of 10e-4 and a batch size of 128. |
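For context, the Experiment Setup row pins down only three concrete choices: the Adam optimizer, an initial learning rate of 10e-4, and a batch size of 128. Below is a minimal sketch of a training loop consistent with that description. The `SimpleCNN` architecture, the epoch count, and the use of MNIST via torchvision are illustrative assumptions; none of them are specified in the excerpt quoted above.

```python
# Hedged sketch of the *NIST setup: Adam, lr 10e-4, batch size 128 (from the paper).
# Architecture and epoch count are placeholders, NOT taken from the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


class SimpleCNN(nn.Module):
    """Assumed classifier for 28x28 grayscale *NIST images (not from the paper)."""

    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(64 * 7 * 7, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


def train(epochs: int = 5):  # epoch count is an assumption
    data = datasets.MNIST("data", train=True, download=True,
                          transform=transforms.ToTensor())
    loader = DataLoader(data, batch_size=128, shuffle=True)  # batch size per the paper
    model = SimpleCNN()
    opt = torch.optim.Adam(model.parameters(), lr=10e-4)  # learning rate per the paper
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model


if __name__ == "__main__":
    train()
```

Note that "10e-4" as written equals 1e-3 in scientific notation; the sketch reproduces the paper's wording literally, but readers attempting a reproduction may want to confirm against Appendix E whether 1e-4 was intended.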