Inter and Intra Topic Structure Learning with Word Embeddings
Authors: He Zhao, Lan Du, Wray Buntine, Mingyuan Zhou
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our model achieves the state-of-the-art performance in terms of perplexity, document classification, and topic quality. |
| Researcher Affiliation | Collaboration | ¹Faculty of Information Technology, Monash University, Australia; ²McCombs School of Business, University of Texas at Austin. |
| Pseudocode | No | The paper describes the inference steps in paragraph form, but does not include a formal pseudocode block or algorithm box. It notes: "Omitted derivations, details, and the overall algorithm are in the supplementary materials." |
| Open Source Code | Yes | https://github.com/ethanhezhao/WEDTM |
| Open Datasets | Yes | We used a regular text dataset (20NG) and three sparse text datasets (WS, TMN, Twitter)... Twitter was extracted from the 2011 and 2012 microblog tracks at the Text REtrieval Conference (TREC, http://trec.nist.gov/data/microblog.html) and preprocessed in Yin & Wang (2014). |
| Dataset Splits | Yes | Here we randomly chose a certain proportion of the word tokens in each document as training and used the remaining ones to calculate per-heldout-word perplexity. For all the datasets, we randomly selected 80% documents for training and used the remaining 20% for testing. (A split sketch follows the table.) |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions using "50-dimensional GloVe word embeddings pre-trained on Wikipedia" but does not specify any software names with version numbers for replication. (A GloVe-loading sketch follows the table.) |
| Experiment Setup | Yes | The hyperparameter settings we used for WEDTM and GBN are a0 = b0 = 0.01, e0 = f0 = 1.0, η0 = 0.05. For MetaLDA and WEI-FTM, we collected 1000 MCMC samples after 1000 burn-ins; for GBN and WEDTM, we collected 1000 (T = 1) or 500 (T > 1) MCMC samples after 1000 (T = 1) or 500 (T > 1) burn-ins, to estimate the posterior mean. Due to the shrinkage effect of WEDTM on S, discussed in Section 4, we set S = 5, which is large enough for all the topics. (A sampling-schedule sketch follows the table.) |
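
The Dataset Splits row describes a two-level protocol: an 80/20 document split for classification, plus a per-document token split for per-heldout-word perplexity, which is conventionally computed as exp(-(1/N) Σᵢ log p(wᵢ)) over the N held-out tokens. Below is a minimal sketch of such a split, assuming documents are lists of token IDs; the name `split_corpus` and the `word_train_frac` default are illustrative (the paper says only "a certain proportion") and are not taken from the WEDTM code.

```python
import numpy as np

def split_corpus(docs, doc_train_frac=0.8, word_train_frac=0.8, seed=0):
    """Two-level split: hold out 20% of documents for testing, and within
    each test document hold out a share of the word tokens, whose
    likelihood under the trained model gives per-heldout-word perplexity."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(docs))
    n_train = int(doc_train_frac * len(docs))
    train_docs = [docs[i] for i in perm[:n_train]]
    test_docs = [docs[i] for i in perm[n_train:]]

    observed, heldout = [], []
    for doc in test_docs:
        tokens = np.asarray(doc)
        keep = rng.random(len(tokens)) < word_train_frac  # tokens used to infer topic proportions
        observed.append(tokens[keep])
        heldout.append(tokens[~keep])                     # tokens scored for perplexity
    return train_docs, observed, heldout
```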
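For the Software Dependencies row: the paper states only that 50-dimensional GloVe embeddings pre-trained on Wikipedia were used. A minimal loader sketch follows, assuming the standard GloVe text format (one word followed by its vector components per line); the file name `glove.6B.50d.txt` is the common distribution name, not a detail from the paper.

```python
import numpy as np

def load_glove(path, vocab):
    """Read word vectors from a GloVe text file and keep only the words
    that appear in the model's vocabulary."""
    vectors = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if parts[0] in vocab:
                vectors[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
    return vectors

# Example (hypothetical file name): load_glove("glove.6B.50d.txt", set(vocab))
```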
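The Experiment Setup row specifies burn-in and collection counts but not the sampler internals. The sketch below shows only the schedule (burn-in iterations discarded, collected samples averaged into a posterior-mean estimate); `gibbs_step` and the `.phi` attribute are placeholders, not the actual WEDTM sampler.

```python
def posterior_mean(state, gibbs_step, n_burnin=1000, n_collect=1000):
    """Discard `n_burnin` Gibbs iterations, then average `n_collect`
    samples, e.g. 1000/1000 for T = 1 and 500/500 for T > 1 as in the
    Experiment Setup row. `gibbs_step` advances the sampler one sweep."""
    for _ in range(n_burnin):
        state = gibbs_step(state)       # burn-in: samples are discarded
    total = None
    for _ in range(n_collect):
        state = gibbs_step(state)
        sample = state.phi              # placeholder for the quantity being averaged
        total = sample if total is None else total + sample
    return total / n_collect
```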