Supervised Word Mover's Distance

Authors: Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger

NeurIPS 2016

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate S-WMD on eight real-world text classification tasks on which it consistently outperforms almost all of our 26 competitive baselines. We evaluate all approaches on 8 document datasets in the settings of news categorization, sentiment analysis, and product identification, among others. Table 1 describes the classification tasks as well as the size and number of classes C of each of the datasets. We evaluate against the following document representation/distance methods: ... Table 2: The kNN test error for all datasets and distances. |
| Researcher Affiliation | Academia | Gao Huang, Chuan Guo (Cornell University, {gh349,cg563}@cornell.edu); Matt J. Kusner (Alan Turing Institute, University of Warwick, mkusner@turing.ac.uk); Yu Sun, Kilian Q. Weinberger (Cornell University, {ys646,kqw4}@cornell.edu); Fei Sha (University of California, Los Angeles, feisha@cs.ucla.edu) |
| Pseudocode | Yes | Algorithm 1 S-WMD |
| Open Source Code | Yes | Our code is implemented in Matlab and is freely available at https://github.com/gaohuang/S-WMD. |
| Open Datasets | Yes | We evaluate S-WMD on 8 different document corpora... Table 1: The document datasets (and their descriptions) used for visualization and evaluation. ... REUTERS news dataset (train/test split [3]) ... TWITTER tweets categorized by sentiment [31] ... 20NEWS canonical news article dataset [3] |
| Dataset Splits | No | For datasets that do not have a predefined train/test split (BBCSPORT, TWITTER, RECIPE, CLASSIC, and AMAZON) we average results over five 70/30 train/test splits and report standard errors. The paper does not explicitly state training/validation/test splits or mention a specific validation split percentage for the data. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | Our code is implemented in Matlab. The paper does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | In our experiments, we use λ = 10, which leads to a nice trade-off between speed and approximation accuracy. In our experiments, we set B = 32 and N = 200, and computing the gradient at each iteration can be done in seconds. |
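The λ quoted in the experiment setup is the entropic-regularization strength of the Sinkhorn approximation to the underlying optimal-transport problem: a larger λ tracks the exact Word Mover's Distance more closely but needs more (and numerically touchier) iterations, which is the speed/accuracy trade-off the authors mention. The following is a minimal NumPy sketch of that regularized transport cost, given as a hypothetical illustration (the function and parameter names `sinkhorn_wmd`, `lam`, `n_iter` are mine, and this is not the authors' Matlab implementation):

```python
import numpy as np

def sinkhorn_wmd(a, b, M, lam=10.0, n_iter=100):
    """Entropy-regularized transport cost between word histograms a and b
    under word-embedding cost matrix M (Cuturi-style Sinkhorn iterations).

    lam: regularization strength; larger values approximate the exact
    transport cost (WMD) more tightly but converge more slowly.
    Hypothetical sketch, not the S-WMD reference code.
    """
    K = np.exp(-lam * M)                  # elementwise Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iter):               # alternate marginal scalings
        u = a / (K @ (b / (K.T @ u)))
    v = b / (K.T @ u)
    T = np.outer(u, v) * K                # regularized transport plan
    return float(np.sum(T * M))           # approximate transport cost
```

With identical histograms and zero self-cost the result is near zero, while moving all mass across a unit-cost pair yields a cost near one, matching the exact optimal-transport values.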