Inter and Intra Topic Structure Learning with Word Embeddings

Authors: He Zhao, Lan Du, Wray Buntine, Mingyuan Zhou

ICML 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our model achieves the state-of-the-art performance in terms of perplexity, document classification, and topic quality.
Researcher Affiliation | Collaboration | (1) Faculty of Information Technology, Monash University, Australia; (2) McCombs School of Business, University of Texas at Austin.
Pseudocode | No | The paper describes the inference steps in paragraph form but does not include a formal pseudocode block or algorithm box. It notes: "Omitted derivations, details, and the overall algorithm are in the supplementary materials."
Open Source Code | Yes | https://github.com/ethanhezhao/WEDTM
Open Datasets | Yes | We used a regular text dataset (20NG) and three sparse text datasets (WS, TMN, Twitter)... Twitter was extracted from the 2011 and 2012 microblog tracks of the Text REtrieval Conference (TREC, http://trec.nist.gov/data/microblog.html) and preprocessed in Yin & Wang (2014).
Dataset Splits | Yes | Here we randomly chose a certain proportion of the word tokens in each document as training and used the remaining ones to calculate per-heldout-word perplexity. For all the datasets, we randomly selected 80% of the documents for training and used the remaining 20% for testing. (A split sketch follows the table.)
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, memory) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions using "50-dimensional GloVe word embeddings pre-trained on Wikipedia" but does not name any software packages with version numbers for replication. (A GloVe loading sketch follows the table.)
Experiment Setup | Yes | The hyperparameter settings we used for WEDTM and GBN are a0 = b0 = 0.01, e0 = f0 = 1.0, η0 = 0.05. For MetaLDA and WEI-FTM, 1000 MCMC samples were collected after 1000 burn-ins; for GBN and WEDTM, 1000 samples after 1000 burn-ins (T = 1) or 500 samples after 500 burn-ins (T > 1) were collected to estimate the posterior mean. Due to the shrinkage effect of WEDTM on S, discussed in Section 4, we set S = 5, which is large enough for all the topics. (A configuration sketch follows the table.)
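Dataset split sketch. A minimal Python sketch (not the authors' code) of the two-level evaluation split described in the Dataset Splits row: an 80/20 random document split, plus a per-document word-token holdout used for per-heldout-word perplexity. The holdout proportion below is a placeholder, since the paper only says "a certain proportion".

    import numpy as np

    rng = np.random.default_rng(0)

    def split_documents(docs, train_frac=0.8):
        # Randomly assign 80% of the documents to training, 20% to testing.
        idx = rng.permutation(len(docs))
        n_train = int(train_frac * len(docs))
        return [docs[i] for i in idx[:n_train]], [docs[i] for i in idx[n_train:]]

    def split_tokens(doc_tokens, observed_frac=0.5):
        # Hold out part of a document's word tokens: the observed tokens are
        # used for inference, the held-out tokens for per-heldout-word
        # perplexity. observed_frac is illustrative; the paper does not fix
        # a single value here.
        tokens = np.asarray(doc_tokens)
        mask = rng.random(len(tokens)) < observed_frac
        return tokens[mask], tokens[~mask]

    # Toy usage with documents given as lists of token ids.
    docs = [[1, 2, 3, 4], [2, 3, 5], [0, 1, 5, 6, 7]]
    train_docs, test_docs = split_documents(docs)
    observed, heldout = split_tokens(train_docs[0])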
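GloVe loading sketch. The paper states only that 50-dimensional GloVe word embeddings pre-trained on Wikipedia were used, without naming a package or version. Below is a minimal loader for the standard GloVe text format; the file name glove.6B.50d.txt is an assumption (the common Wikipedia+Gigaword release from https://nlp.stanford.edu/projects/glove/), not something the paper specifies.

    import numpy as np

    def load_glove(path, vocab=None):
        # Read the whitespace-separated GloVe text format into a {word: vector} dict.
        # If vocab is given, keep only words in the topic model's vocabulary.
        embeddings = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                word, *values = line.rstrip().split(" ")
                if vocab is None or word in vocab:
                    embeddings[word] = np.asarray(values, dtype=np.float32)
        return embeddings

    # emb = load_glove("glove.6B.50d.txt", vocab={"topic", "model", "word"})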
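Configuration sketch. The hyperparameters reported in the Experiment Setup row, collected into one place. This is a restatement of the quoted values, not the authors' configuration file; the function name and dictionary keys are illustrative.

    def wedtm_settings(T):
        # Hyperparameters reported for WEDTM/GBN; sampler settings depend on
        # the number of layers T (1000/1000 for T = 1, 500/500 for T > 1).
        return {
            "a0": 0.01, "b0": 0.01,
            "e0": 1.0, "f0": 1.0,
            "eta0": 0.05,
            "S": 5,  # sub-topics per topic; the shrinkage effect makes 5 ample
            "burnin": 1000 if T == 1 else 500,
            "mcmc_samples": 1000 if T == 1 else 500,  # used for the posterior mean
        }

    print(wedtm_settings(T=3))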