A Novel Neural Topic Model and Its Supervised Extension

Authors: Ziqiang Cao, Sujian Li, Yang Liu, Wenjie Li, Heng Ji

AAAI 2015

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our models are competitive in both topic discovery and classification/regression tasks. |
| Researcher Affiliation | Academia | Ziqiang Cao¹, Sujian Li¹, Yang Liu¹, Wenjie Li², Heng Ji³; ¹Key Laboratory of Computational Linguistics, Peking University, MOE, China; ²Computing Department, Hong Kong Polytechnic University, Hong Kong; ³Computer Science Department, Rensselaer Polytechnic Institute, USA |
| Pseudocode | Yes | Algorithm 1: Training Algorithm for NTM (sNTM) |
| Open Source Code | No | The paper only provides a link to `word2vec` (https://code.google.com/p/word2vec/), a third-party tool used by the authors, not the source code for the methodology described in the paper. (See the embedding sketch after the table.) |
| Open Datasets | Yes | Here, three datasets, namely 20 Newsgroups (http://qwone.com/~jason/20Newsgroups/), Wiki10+ (Zubiaga 2012) and Movie review data (Pang and Lee 2005), are used. |
| Dataset Splits | No | Table 1 describes the experimental data, including the size of the training part and test part, the average length in words per document, and the task. No explicit validation set or split is mentioned. |
| Hardware Specification | No | The paper does not report the hardware used for its experiments (e.g., specific GPU/CPU models or memory amounts), only general statements about the experimental setup. |
| Software Dependencies | No | The paper mentions using word2vec but gives no version number for it or for any other software dependency. |
| Experiment Setup | Yes | The learning rate is set to 0.01 and the regularization factor is set to 0.001. For testing, we need to perform an inference step to re-compute a proper W1 for new documents. This is similar to the training process of NTM while the remaining parameters are fixed. The topic numbers of these models are all set to 100. (A hedged sketch of this inference step follows the table.) |
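The Experiment Setup row quotes the hyperparameters (learning rate 0.01, regularization factor 0.001, 100 topics) and an inference step that re-computes W1 for new documents while the remaining parameters stay fixed. Below is a minimal NumPy sketch of what such an inference step could look like. It is not the authors' code: the scoring objective, embedding dimensionality, iteration count, and the frozen matrix standing in for the trained network are all assumptions; only the learning rate, regularization factor, and topic number come from the paper.

```python
import numpy as np

# Values quoted from the paper's experiment setup.
N_TOPICS = 100        # "The topic numbers of these models are all set to 100."
LEARNING_RATE = 0.01
REG_FACTOR = 0.001

# Assumed values, not given in this summary.
EMBED_DIM = 100       # word-embedding dimensionality
N_ITERS = 50          # number of inference iterations

rng = np.random.default_rng(0)

# Stand-in for the already-trained parameters that map n-gram embeddings into
# topic space; during inference these stay frozen and only the document's
# topic weights (its row of W1) are updated.
topic_matrix = rng.normal(scale=0.01, size=(EMBED_DIM, N_TOPICS))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def infer_document_topics(ngram_embeddings):
    """Re-compute a document-topic vector for an unseen document while all
    other parameters are held fixed. The objective here is a toy one: raise
    the scores of the document's own n-grams, with L2 regularization."""
    w1 = np.full(N_TOPICS, 1.0 / N_TOPICS)        # document-topic weights
    topic_repr = ngram_embeddings @ topic_matrix  # n-gram topic representations
    for _ in range(N_ITERS):
        scores = topic_repr @ w1                  # document / n-gram match scores
        grad = -topic_repr.T @ (1.0 - sigmoid(scores)) + REG_FACTOR * w1
        w1 -= LEARNING_RATE * grad                # gradient step on w1 only
        w1 = np.clip(w1, 0.0, None)               # keep topic weights non-negative
        w1 /= max(w1.sum(), 1e-12)                # renormalize to a distribution
    return w1

# Usage: infer topic proportions for a toy document of five n-gram embeddings.
doc_ngrams = rng.normal(size=(5, EMBED_DIM))
print(infer_document_topics(doc_ngrams).argmax())  # index of the top topic
```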
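The Open Source Code row notes that the paper only links to the Google word2vec tool rather than releasing model code. As a rough illustration of how input word embeddings could be prepared for such a model, here is a small sketch using gensim as a stand-in for that tool; the corpus, embedding size, and all other settings are assumptions and do not reproduce the paper's preprocessing.

```python
from gensim.models import Word2Vec

# Toy corpus; in practice this would be the tokenized training documents.
corpus = [
    ["neural", "topic", "models", "learn", "document", "representations"],
    ["word", "embeddings", "capture", "semantic", "similarity"],
]

model = Word2Vec(
    sentences=corpus,
    vector_size=100,   # embedding size (assumed); this keyword is gensim 4.x
    window=5,
    min_count=1,
    workers=4,
)

# Embedding lookup for a word, as would feed the model's n-gram representations.
vec = model.wv["topic"]
print(vec.shape)  # (100,)
```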