On Modelling Non-linear Topical Dependencies
Authors: Zhixing Li, Siqiang Wen, Juanzi Li, Peng Zhang, Jie Tang
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on four data sets show that GTRF achieves much lower perplexity than LDA and linear dependency topic models and produces better topic coherence. |
| Researcher Affiliation | Academia | Zhixing Li (ADAM0730@GMAIL.COM), Siqiang Wen (WENSQ2329@GMAIL.COM), Juanzi Li (LIJUANZI@TSINGHUA.EDU.CN), Peng Zhang (ZPJUMPER@GMAIL.COM), Jie Tang (JIETANG@TSINGHUA.EDU.CN); Department of Computer Science, Tsinghua University, Beijing, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. The inference and estimation procedures are described in narrative text. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing its source code or a link to a code repository for the GTRF model described. |
| Open Datasets | Yes | Reuters-21578: http://www.daviddlewis.com/resources/testcollections/reuters21578/; 20 Newsgroups: http://qwone.com/~jason/20Newsgroups/; NIPS data (A. et al., 2007): http://ai.stanford.edu/~gal/Data/NIPS/; ICML data: the accepted papers of ICML from 2007 to 2013. |
| Dataset Splits | No | The paper states: "For all datasets, we train models with two thirds of documents and calculate predicative perplexity on the unseen one third of documents." This describes a training and test split, but does not explicitly mention a separate validation set for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, cloud instances) used to run the experiments. |
| Software Dependencies | No | The paper mentions using the "Stanford Parser (Marneffe et al., 2006)" but does not provide a specific version number for the parser or any other software dependencies. |
| Experiment Setup | Yes | For all datasets, we train models with two thirds of documents and calculate predictive perplexity on the unseen one third of documents. ... we test all three models on ICML and NIPS data with topic numbers K = 10, 15, 20, 25. For the other two datasets, we test all three models with topic numbers K = 10, 20, 50, 100. In our GTRF model, there is a control parameter λ2 that cannot be estimated directly, and we test GTRF with λ2 = 0.2, 0.4, 0.6, 0.8. A minimal sketch of this held-out evaluation protocol is given below the table. |
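
For reference, the held-out (predictive) perplexity cited in the Dataset Splits and Experiment Setup rows is the standard topic-model evaluation quantity; the formula below is the usual definition, not one quoted from this paper:

$$
\mathrm{perplexity}(D_{\text{test}}) = \exp\!\left( -\,\frac{\sum_{d=1}^{M} \log p(\mathbf{w}_d)}{\sum_{d=1}^{M} N_d} \right)
$$

where $\mathbf{w}_d$ are the words of held-out document $d$, $N_d$ is its length, and lower values indicate a better predictive fit.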
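Since no GTRF implementation is released, the protocol described above can only be approximated with off-the-shelf tools. The sketch below uses scikit-learn's LDA on the 20 Newsgroups corpus as a stand-in baseline; the 2/3 train / 1/3 held-out split and the topic-number sweep K = 10, 20, 50, 100 follow the paper, while the GTRF-specific parameter λ2 has no counterpart here.

```python
# Minimal sketch of the held-out perplexity protocol described in the table,
# using scikit-learn's LDA as a stand-in (no GTRF code is publicly available).
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.model_selection import train_test_split

# 20 Newsgroups is one of the four corpora used in the paper.
docs = fetch_20newsgroups(remove=("headers", "footers", "quotes")).data
X = CountVectorizer(max_features=10000, stop_words="english").fit_transform(docs)

# Paper protocol: train on two thirds of the documents, evaluate on the unseen third.
X_train, X_test = train_test_split(X, test_size=1 / 3, random_state=0)

# Topic numbers reported for Reuters-21578 and 20 Newsgroups.
for K in (10, 20, 50, 100):
    lda = LatentDirichletAllocation(n_components=K, random_state=0)
    lda.fit(X_train)
    print(f"K={K:3d}  held-out perplexity = {lda.perplexity(X_test):.1f}")
```

This only reproduces the baseline side of the comparison; evaluating GTRF itself would require reimplementing the model from the paper's narrative description.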