Supervised Word Mover's Distance
Authors: Gao Huang, Chuan Guo, Matt J. Kusner, Yu Sun, Fei Sha, Kilian Q. Weinberger
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate S-WMD on eight real-world text classification tasks on which it consistently outperforms almost all of our 26 competitive baselines. We evaluate all approaches on 8 document datasets in the settings of news categorization, sentiment analysis, and product identification, among others. Table 1 describes the classification tasks as well as the size and number of classes C of each of the datasets. We evaluate against the following document representation/distance methods: ... Table 2: The kNN test error for all datasets and distances. |
| Researcher Affiliation | Academia | Gao Huang, Chuan Guo (Cornell University) {gh349,cg563}@cornell.edu; Matt J. Kusner (Alan Turing Institute, University of Warwick) mkusner@turing.ac.uk; Yu Sun, Kilian Q. Weinberger (Cornell University) {ys646,kqw4}@cornell.edu; Fei Sha (University of California, Los Angeles) feisha@cs.ucla.edu |
| Pseudocode | Yes | Algorithm 1 S-WMD |
| Open Source Code | Yes | Our code is implemented in Matlab and is freely available at https://github.com/gaohuang/S-WMD. |
| Open Datasets | Yes | We evaluate S-WMD on 8 different document corpora... Table 1: The document datasets (and their descriptions) used for visualization and evaluation. ... REUTERS news dataset (train/test split [3]) ... TWITTER tweets categorized by sentiment [31] ... 20NEWS canonical news article dataset [3] |
| Dataset Splits | No | For datasets that do not have a predefined train/test split: BBCSPORT, TWITTER, RECIPE, CLASSIC, and AMAZON we average results over five 70/30 train/test splits and report standard errors. The paper does not explicitly state a validation split or its percentage. (A minimal sketch of this split-averaging protocol appears below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | Our code is implemented in Matlab. The paper does not provide specific version numbers for software dependencies or libraries. |
| Experiment Setup | Yes | In our experiments, we use λ = 10, which leads to a nice trade-off between speed and approximation accuracy. In our experiments, we set B = 32 and N = 200, and computing the gradient at each iteration can be done in seconds. (A hedged sketch of the Sinkhorn relaxation with λ = 10 also appears below the table.) |
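
The Dataset Splits row quotes the paper's protocol of averaging results over five random 70/30 train/test splits and reporting standard errors. The sketch below illustrates that protocol only: the feature matrix `X`, label vector `y`, Euclidean metric, and `k = 7` are placeholders, since the paper runs kNN with the learned S-WMD distance and dataset-specific k, and its released code is in Matlab rather than Python.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

def knn_error_over_splits(X, y, n_splits=5, test_size=0.30, k=7, seed=0):
    """Average kNN test error over random 70/30 train/test splits.

    Placeholder setup: plain Euclidean kNN on feature vectors X; the paper
    instead evaluates kNN with the learned S-WMD document distance.
    """
    errors = []
    for i in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed + i, stratify=y)
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_tr, y_tr)
        errors.append(1.0 - clf.score(X_te, y_te))   # test error on this split
    errors = np.asarray(errors)
    # mean error and its standard error across the splits
    return errors.mean(), errors.std(ddof=1) / np.sqrt(n_splits)
```

Calling `knn_error_over_splits(X, y)` returns the averaged test error and its standard error, mirroring the reporting style quoted above for BBCSPORT, TWITTER, RECIPE, CLASSIC, and AMAZON.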
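
The λ = 10 quoted in the Experiment Setup row is the entropic-regularization strength of the Sinkhorn-relaxed word mover's distance that S-WMD optimizes. Below is a minimal sketch of that relaxation under stated assumptions: `a` and `b` are normalized bag-of-words histograms for two documents, `M` is their pairwise word-embedding distance matrix, and a fixed iteration count stands in for a convergence check. It is an illustration, not the authors' implementation.

```python
import numpy as np

def sinkhorn_relaxed_wmd(a, b, M, lam=10.0, n_iter=50):
    """Entropy-regularized transport cost between two word histograms.

    Assumptions: a (length n) and b (length m) each sum to 1, M is the n x m
    matrix of distances between the documents' word embeddings, and lam = 10
    follows the speed/accuracy trade-off value quoted from the paper.
    """
    K = np.exp(-lam * M)            # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):         # Sinkhorn fixed-point updates of the scalings
        v = b / (K.T @ u)
        u = a / (K @ v)
    # transport cost under the (approximately optimal) plan diag(u) K diag(v)
    return float(u @ (K * M) @ v)
```

The B = 32 and N = 200 in the same row are gradient-approximation settings (reported in the paper as making each gradient iteration take seconds) and do not enter this distance computation.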