Latent Support Measure Machines for Bag-of-Words Data Classification

Authors: Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In the experiments, we show that the latent SMM achieves state-of-the-art accuracy for BoW text classification, is robust with respect to its own hyper-parameters, and is useful for visualizing words.
Researcher Affiliation | Collaboration | Yuya Yoshikawa, Nara Institute of Science and Technology, Nara 630-0192, Japan (yoshikawa.yuya.yl9@is.naist.jp); Tomoharu Iwata, NTT Communication Science Laboratories, Kyoto 619-0237, Japan (iwata.tomoharu@lab.ntt.co.jp); Hiroshi Sawada, NTT Service Evolution Laboratories, Kanagawa 239-0847, Japan (sawada.hiroshi@lab.ntt.co.jp)
Pseudocode | No | The paper provides mathematical formulations and descriptions of the proposed method but does not include any pseudocode or explicitly labeled algorithm blocks.
Open Source Code | No | The paper refers to existing implementations of baseline methods (MedLDA, word2vec) and to dataset sources, but does not state that code for the proposed latent SMM is open-source or publicly available.
Open Datasets | Yes | For the evaluation, we used the following three standard multi-class text classification datasets: WebKB, Reuters-21578 and 20 Newsgroups. These datasets, which have already been preprocessed by removing short and stop words, are found in [19] and can be downloaded from the author's website.
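The preprocessing described for these datasets (dropping short words and stop words before counting) can be sketched with the standard library; the stop-word list and minimum length here are illustrative assumptions, not the paper's actual preprocessing pipeline.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "an", "of", "and", "to"}  # toy list; the real datasets use a fuller one

def bag_of_words(text, min_len=3):
    """Build a bag-of-words count after removing short words and stop words,
    in the spirit of the preprocessing described for the datasets."""
    tokens = [t for t in text.lower().split()
              if len(t) >= min_len and t not in STOP_WORDS]
    return Counter(tokens)

bow = bag_of_words("The latent SMM classifies bag of words data")
# 'the' (stop word) and 'of' (too short) are removed; the rest are counted.
```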
Dataset Splits | No | The paper states: 'Here we randomly chose five sets of training samples, and used the remaining samples for each of the training sets as the test set.' A training/test split is described, but no separate validation set or cross-validation procedure for hyper-parameter tuning is mentioned.
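The splitting procedure quoted above can be sketched as follows; the sample counts and seed are placeholder assumptions, since the paper does not report them in this passage.

```python
import numpy as np

def random_splits(n_samples, n_train, n_repeats=5, seed=0):
    """Randomly choose `n_repeats` training sets of size `n_train`;
    for each, the remaining samples form the test set (no separate
    validation set is described in the paper)."""
    rng = np.random.default_rng(seed)
    splits = []
    for _ in range(n_repeats):
        perm = rng.permutation(n_samples)
        splits.append((perm[:n_train], perm[n_train:]))
    return splits

# Example with made-up sizes: 5 random 60/40 train/test splits.
splits = random_splits(n_samples=100, n_train=60)
train_idx, test_idx = splits[0]
```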
Hardware Specification | No | The paper does not provide any details about the hardware (e.g., CPU or GPU model, memory) used to run the experiments.
Software Dependencies | No | The paper mentions using LIBSVM, the authors' implementation of MedLDA, and word2vec, but does not specify version numbers for any of these dependencies.
Experiment Setup | Yes | In our experiments, we choose the optimal parameters for these methods from the following variations: γ ∈ {10^-3, 10^-2, ..., 10^3} in the latent SMM, SVD+SMM, word2vec+SMM and SVM with a Gaussian RBF kernel; C ∈ {2^-3, 2^-1, ..., 2^5, 2^7} in all the methods; regularizer parameter ρ ∈ {10^-2, 10^-1, 10^0} and latent dimensionality q ∈ {2, 3, 4} in the latent SMM; and the latent dimensionality of MedLDA and SVD+SMM ranging over {10, 20, ..., 50}.
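The reported search grids can be enumerated directly; the odd-exponent spacing of the C grid is an assumption reconstructed from the garbled exponents in the extracted text.

```python
from itertools import product

# Hyper-parameter grids as reported for the latent SMM.
gammas = [10.0 ** e for e in range(-3, 4)]           # γ ∈ {10^-3, ..., 10^3}
Cs     = [2.0 ** e for e in (-3, -1, 1, 3, 5, 7)]    # C ∈ {2^-3, 2^-1, ..., 2^7} (assumed odd exponents)
rhos   = [10.0 ** e for e in range(-2, 1)]           # ρ ∈ {10^-2, 10^-1, 10^0}
qs     = [2, 3, 4]                                   # latent dimensionality q

# Exhaustive grid for the latent SMM: 7 * 6 * 3 * 3 = 378 configurations.
grid = list(product(gammas, Cs, rhos, qs))
print(len(grid))  # 378
```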