Latent Support Measure Machines for Bag-of-Words Data Classification
Authors: Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada
NIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In the experiments, we show that the latent SMM achieves state-of-the-art accuracy for BoW text classification, is robust with respect to its own hyper-parameters, and is useful to visualize words. |
| Researcher Affiliation | Collaboration | Yuya Yoshikawa, Nara Institute of Science and Technology, Nara, 630-0192, Japan (yoshikawa.yuya.yl9@is.naist.jp); Tomoharu Iwata, NTT Communication Science Laboratories, Kyoto, 619-0237, Japan (iwata.tomoharu@lab.ntt.co.jp); Hiroshi Sawada, NTT Service Evolution Laboratories, Kanagawa, 239-0847, Japan (sawada.hiroshi@lab.ntt.co.jp) |
| Pseudocode | No | The paper provides mathematical formulations and descriptions of the proposed method but does not include any pseudocode or explicitly labeled algorithm blocks. |
| Open Source Code | No | The paper refers to existing implementations for baseline methods (MedLDA, word2vec) and dataset sources, but does not state that the code for the proposed latent SMM is open-source or publicly available. |
| Open Datasets | Yes | For the evaluation, we used the following three standard multi-class text classification datasets: WebKB, Reuters-21578 and 20 Newsgroups. These datasets, which have already been preprocessed by removing short and stop words, are found in [19] and can be downloaded from the authors' website. |
| Dataset Splits | No | The paper states: 'Here we randomly chose five sets of training samples, and used the remaining samples for each of the training sets as the test set.' It describes a training and test split but does not explicitly mention a separate validation set or detail a cross-validation setup for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'LIBSVM', 'the author's implementation of MedLDA' and 'word2vec', but does not specify version numbers for any of these software dependencies. |
| Experiment Setup | Yes | In our experiments, we choose the optimal parameters for these methods from the following variations: γ ∈ {10^-3, 10^-2, ..., 10^3} in the latent SMM, SVD+SMM, word2vec+SMM and SVM with a Gaussian RBF kernel; C ∈ {2^-3, 2^-1, ..., 2^5, 2^7} in all the methods; regularizer parameter ρ ∈ {10^-2, 10^-1, 10^0} and latent dimensionality q ∈ {2, 3, 4} in the latent SMM; and the latent dimensionality of MedLDA and SVD+SMM ranges over {10, 20, ..., 50}. |
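The hyper-parameter grid quoted above can be enumerated explicitly. The sketch below is an illustration only, not the authors' code: the step sizes inside the elided ranges (e.g. whether γ steps by powers of ten, or C by odd powers of two) are assumptions read off the listed endpoints.

```python
import itertools

# Grids reconstructed from the paper's reported search ranges.
# Step sizes between the listed endpoints are assumptions.
gamma_values = [10.0**e for e in range(-3, 4)]         # {10^-3, ..., 10^3}
C_values = [2.0**e for e in (-3, -1, 1, 3, 5, 7)]      # {2^-3, 2^-1, ..., 2^5, 2^7}
rho_values = [10.0**e for e in range(-2, 1)]           # {10^-2, 10^-1, 10^0}
q_values = [2, 3, 4]                                   # latent SMM dimensionality
topic_values = list(range(10, 51, 10))                 # MedLDA / SVD+SMM latent dims

# Full search grid for the latent SMM (gamma x C x rho x q).
latent_smm_grid = list(
    itertools.product(gamma_values, C_values, rho_values, q_values)
)
print(len(latent_smm_grid))  # 7 * 6 * 3 * 3 = 378 configurations
```

Even under these assumptions the latent SMM grid is modest (a few hundred configurations), which is consistent with the paper's claim that the method is robust to its hyper-parameters.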