reproducibilityindex.ai

Latent Support Measure Machines for Bag-of-Words Data Classification

Authors: Yuya Yoshikawa, Tomoharu Iwata, Hiroshi Sawada

NeurIPS 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In the experiments, we show that the latent SMM achieves state-of-the-art accuracy for BoW text classification, is robust with respect to its own hyper-parameters, and is useful to visualize words.
Researcher Affiliation	Collaboration	Yuya Yoshikawa, Nara Institute of Science and Technology, Nara, 630-0192, Japan, yoshikawa.yuya.yl9@is.naist.jp; Tomoharu Iwata, NTT Communication Science Laboratories, Kyoto, 619-0237, Japan, iwata.tomoharu@lab.ntt.co.jp; Hiroshi Sawada, NTT Service Evolution Laboratories, Kanagawa, 239-0847, Japan, sawada.hiroshi@lab.ntt.co.jp
Pseudocode	No	The paper provides mathematical formulations and descriptions of the proposed method but does not include any pseudocode or explicitly labeled algorithm blocks.
Open Source Code	No	The paper refers to existing implementations for baseline methods (MedLDA, word2vec) and dataset sources, but does not state that the code for the proposed latent SMM is open-source or publicly available.
Open Datasets	Yes	For the evaluation, we used the following three standard multi-class text classification datasets: WebKB, Reuters-21578 and 20 Newsgroups. These datasets, which have already been preprocessed by removing short and stop words, are found in [19] and can be downloaded from the author's website (footnote 1).
Dataset Splits	No	The paper states: 'Here we randomly chose five sets of training samples, and used the remaining samples for each of the training sets as the test set.' It describes a training and test split but does not explicitly mention a separate validation set or detail a cross-validation setup for hyperparameter tuning.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper mentions using 'LIBSVM' and 'the author's implementation of MedLDA' and 'word2vec', but does not specify version numbers for any of these software dependencies.
Experiment Setup	Yes	In our experiments, we choose the optimal parameters for these methods from the following variations: $\gamma \in \{10^{-3}, 10^{-2}, \ldots, 10^{3}\}$ in the latent SMM, SVD+SMM, word2vec+SMM and SVM with a Gaussian RBF kernel, $C \in \{2^{-3}, 2^{-1}, \ldots, 2^{5}, 2^{7}\}$ in all the methods, regularizer parameter $\rho \in \{10^{-2}, 10^{-1}, 10^{0}\}$, latent dimensionality $q \in \{2, 3, 4\}$ in the latent SMM, and the latent dimensionality of MedLDA and SVD+SMM ranges $\{10, 20, \ldots, 50\}$.
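
The hyperparameter grid quoted in the Experiment Setup row can be enumerated with a short sketch like the one below. It is an illustration only, not the authors' code: the variable names and the `latent_smm_grid` helper are hypothetical, and the step sizes inside the elided ranges (exponent step 1 for the powers of 10, step 2 for the powers of 2) are assumptions inferred from the listed endpoints.

```python
from itertools import product

# Candidate values as quoted in the Experiment Setup row.
# ASSUMPTION: the elided ranges advance the exponent by 1 (powers of 10)
# and by 2 (powers of 2), inferred from the endpoints that are listed.
gammas = [10.0 ** k for k in range(-3, 4)]   # 10^-3, 10^-2, ..., 10^3
Cs = [2.0 ** k for k in range(-3, 8, 2)]     # 2^-3, 2^-1, ..., 2^5, 2^7
rhos = [10.0 ** k for k in range(-2, 1)]     # 10^-2, 10^-1, 10^0
qs = [2, 3, 4]                               # latent dimensionality q


def latent_smm_grid():
    """Yield every (gamma, C, rho, q) combination for the latent SMM.

    Hypothetical helper: it only enumerates settings and does not
    train or evaluate any model.
    """
    for gamma, C, rho, q in product(gammas, Cs, rhos, qs):
        yield {"gamma": gamma, "C": C, "rho": rho, "q": q}


grid = list(latent_smm_grid())
print(len(grid))  # number of candidate settings under the assumed step sizes
```

Under these assumptions the grid has 7 x 6 x 3 x 3 = 378 settings. Because the paper does not describe a validation set (see the Dataset Splits row), how the best setting was selected from such a grid is not specified here.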