A Novel Neural Topic Model and Its Supervised Extension
Authors: Ziqiang Cao, Sujian Li, Yang Liu, Wenjie Li, Heng Ji
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that our models are competitive in both topic discovery and classification/regression tasks. |
| Researcher Affiliation | Academia | Ziqiang Cao¹, Sujian Li¹, Yang Liu¹, Wenjie Li², Heng Ji³ (¹Key Laboratory of Computational Linguistics, Peking University, MOE, China; ²Computing Department, Hong Kong Polytechnic University, Hong Kong; ³Computer Science Department, Rensselaer Polytechnic Institute, USA) |
| Pseudocode | Yes | Algorithm 1: Training Algorithm for NTM (sNTM). (A hedged training sketch appears after this table.) |
| Open Source Code | No | The paper only links to `word2vec` (https://code.google.com/p/word2vec/), a third-party tool it uses, not source code for the methodology described in the paper. |
| Open Datasets | Yes | Here, three datasets, namely 20 Newsgroups (http://qwone.com/~jason/20Newsgroups/), Wiki10+ (Zubiaga 2012) and Movie review data (Pang and Lee 2005), are used. |
| Dataset Splits | No | Table 1 describes the experimental data, including the size of the training part and test part, the average length in words per document, and task. (No explicit validation set or split is mentioned.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments, only general statements about the experimental setup. |
| Software Dependencies | No | The paper mentions using word2vec but gives no version number for it or for any other software dependency. |
| Experiment Setup | Yes | The learning rate is set to 0.01 and the regularization factor to 0.001. For testing, an inference step re-computes a proper W1 for new documents; this mirrors the training process of NTM while the remaining parameters are fixed. The topic numbers of these models are all set to 100. (An inference sketch appears after the table.) |
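The paper's Algorithm 1 trains NTM/sNTM by backpropagation, and the quoted setup fixes the learning rate (0.01), regularization factor (0.001), and topic number (100). Since no source code is released, the following is a minimal, hypothetical PyTorch sketch of such a training loop; the pairwise ranking loss, margin, negative sampling scheme, matrix sizes, and all variable names other than W1/W2 are assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes; only the topic number K = 100 comes from the paper.
V, K, D = 5000, 100, 1000   # vocabulary size, topics, training documents

# W1: per-document topic weights, W2: per-topic word weights (paper notation).
W1 = torch.randn(D, K, requires_grad=True)
W2 = torch.randn(K, V, requires_grad=True)

# SGD with the quoted learning rate; weight_decay stands in for the
# regularization factor (an assumption about how it was applied).
opt = torch.optim.SGD([W1, W2], lr=0.01, weight_decay=0.001)

def score(doc_ids, word_ids):
    """Match a document's topic distribution against word-topic scores."""
    theta = F.softmax(W1[doc_ids], dim=-1)       # document-topic distribution
    phi = F.softmax(W2, dim=0)[:, word_ids].t()  # word-topic representation
    return (theta * phi).sum(-1)

for step in range(1000):
    docs = torch.randint(0, D, (64,))
    pos = torch.randint(0, V, (64,))   # stand-in: words observed in each doc
    neg = torch.randint(0, V, (64,))   # stand-in: randomly sampled negatives
    # Pairwise hinge loss: observed words should outscore sampled negatives.
    loss = F.relu(0.5 - score(docs, pos) + score(docs, neg)).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```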
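The quoted inference step keeps the trained parameters fixed and re-computes only W1 for unseen documents. Continuing the sketch above (same caveats: `W2`, `V`, `K`, and the loss form are reused assumptions), that might look like:

```python
# Freeze the trained topic-word weights and fit fresh document-topic
# weights W1_new for unseen test documents (all names are illustrative).
N_test = 200
W2.requires_grad_(False)

W1_new = torch.randn(N_test, K, requires_grad=True)
opt_inf = torch.optim.SGD([W1_new], lr=0.01, weight_decay=0.001)

def score_new(doc_ids, word_ids):
    theta = F.softmax(W1_new[doc_ids], dim=-1)
    phi = F.softmax(W2, dim=0)[:, word_ids].t()
    return (theta * phi).sum(-1)

for step in range(500):
    docs = torch.randint(0, N_test, (64,))
    pos = torch.randint(0, V, (64,))
    neg = torch.randint(0, V, (64,))
    loss = F.relu(0.5 - score_new(docs, pos) + score_new(docs, neg)).mean()
    opt_inf.zero_grad()
    loss.backward()
    opt_inf.step()
```

Only `W1_new` receives gradients here, which matches the quoted statement that inference "is similar to the training process of NTM while the remaining parameters are fixed."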