Learning with a Wasserstein Loss

Authors: Charlie Frogner, Chiyuan Zhang, Hossein Mobahi, Mauricio Araya, Tomaso A. Poggio

NeurIPS 2015

Each entry below gives a reproducibility variable, the assessed result, and the LLM response supporting that assessment.
Research Type: Experimental
LLM Response: "We demonstrate this property on a real-data tag prediction problem, using the Yahoo Flickr Creative Commons dataset, outperforming a baseline that doesn't use the metric."

Researcher Affiliation: Collaboration
LLM Response: "Charlie Frogner, Chiyuan Zhang (Center for Brains, Minds and Machines, Massachusetts Institute of Technology; frogner@mit.edu, chiyuan@mit.edu); Hossein Mobahi (CSAIL, Massachusetts Institute of Technology; hmobahi@csail.mit.edu); Mauricio Araya-Polo (Shell International E&P, Inc.; Mauricio.Araya@shell.com); Tomaso Poggio (Center for Brains, Minds and Machines, Massachusetts Institute of Technology; tp@ai.mit.edu)"

Pseudocode: Yes
LLM Response: "Algorithm 1: Gradient of the Wasserstein loss"
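The cited Algorithm 1 computes the loss gradient via Sinkhorn-style matrix scaling on the entropically smoothed Wasserstein objective. Below is a minimal NumPy sketch of that structure, not the authors' released code: the kernel definition, stopping rule, default parameters, and all names are our assumptions.

```python
import numpy as np

def wasserstein_grad(h, y, M, lam=10.0, n_iter=100, tol=1e-9):
    """Sketch of a Sinkhorn-style gradient of the smoothed Wasserstein loss
    between a prediction h and a target y (both vectors on the simplex,
    h assumed strictly positive). M is the ground-metric cost matrix and
    lam the smoothing parameter. Names and defaults are illustrative."""
    K = np.exp(-lam * M - 1.0)          # kernel derived from the ground metric
    u = np.ones_like(h)
    for _ in range(n_iter):
        u_prev = u
        # alternating projections onto the two marginal constraints
        u = h / (K @ (y / (K.T @ u)))
        if np.max(np.abs(u - u_prev)) < tol:
            break
    # dual variable log(u)/lam, centered so the gradient sums to zero
    alpha = np.log(u) / lam
    return alpha - alpha.mean()

# Illustrative call: 10 classes with a 0/1 ground metric.
M = 1.0 - np.eye(10)
h = np.full(10, 0.1)                    # uniform prediction
y = np.eye(10)[3]                       # one-hot target
print(wasserstein_grad(h, y, M))
```

Centering the dual variable reflects the fact that the loss is defined on the simplex, where gradients are determined only up to an additive constant.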
Open Source Code: Yes
LLM Response: "Code and data are available at http://cbcl.mit.edu/wasserstein."

Open Datasets: Yes
LLM Response: "using the recently released Yahoo/Flickr Creative Commons 100M dataset [23]. ... The dataset used here is available at http://cbcl.mit.edu/wasserstein."

Dataset Splits: No
LLM Response: The paper mentions training and testing sets, but does not provide specific details about a validation set or explicit split percentages for training, validation, and testing.

Hardware Specification: No
LLM Response: The paper does not provide specific details about the hardware used to run its experiments, such as CPU/GPU models or memory specifications.

Software Dependencies: No
LLM Response: The paper mentions using "word2vec [24]" and "MatConvNet [25]" but does not specify version numbers for these software components.
Experiment Setup: Yes
LLM Response: "We train a model independently for each value of p and plot the average predicted probabilities of the different digits on the test set in Figure 4. ... Specifically, we train a linear model by minimizing $W_p^p + \lambda\,\mathrm{KL}$ on the training set, where $\lambda$ controls the relative weight of KL."
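The quoted setup trains a linear model on a weighted sum of the smoothed Wasserstein loss and a KL divergence to the target. The sketch below illustrates one gradient step of such a combined objective for a linear-softmax model; it reuses wasserstein_grad from the sketch above. The weight (called gamma here to avoid clashing with the Sinkhorn lam) and all function names are hypothetical, not the paper's implementation.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                     # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def combined_loss_grad(W, x, y, M, lam_w=10.0, gamma=0.5):
    """Illustrative gradient w.r.t. the weight matrix W of a linear-softmax
    model h = softmax(W x), under the combined objective W_p^p + gamma*KL.
    gamma and all names here are assumptions for the sketch."""
    h = softmax(W @ x)
    # gradients of both loss terms with respect to the output h
    g_wass = wasserstein_grad(h, y, M, lam=lam_w)   # from the sketch above
    g_kl = -y / np.clip(h, 1e-12, None)             # d/dh of KL(y || h)
    g_out = g_wass + gamma * g_kl
    # back through the softmax Jacobian, then through the linear map
    J = np.diag(h) - np.outer(h, h)
    return np.outer(J @ g_out, x)                   # gradient w.r.t. W
```

Backpropagating through the softmax keeps the prediction h on the probability simplex, which both loss terms require of their arguments.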