Progressive EM for Latent Tree Models and Hierarchical Topic Detection

Authors: Peixian Chen, Nevin Zhang, Leonard Poon, Zhourong Chen

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Empirical experiments show that our method greatly improves the efficiency of HLTA. It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies." "We tested the idea on the New York Times dataset, which consists of 300,000 articles."
Researcher Affiliation | Academia | The Hong Kong University of Science and Technology, {pchenac, lzhang, zchenbb}@cse.ust.hk; The Hong Kong Institute of Education, kmpoon@ied.edu.hk
Pseudocode | Yes | "Algorithm 1 PEM-HLTA(D, τ, δ, κ)"
Open Source Code | No | The paper links to code for the earlier HLTA algorithm (prior work), but it does not state that code for the method proposed in this paper (PEM-HLTA) is open source, nor does it provide a link to it.
Open Datasets | Yes | Two of the datasets used are NIPS (http://www.cs.nyu.edu/~roweis/data.html) and Newsgroup (http://qwone.com/~jason/20Newsgroups/). "We tested the idea on the New York Times dataset, which consists of 300,000 articles." (http://archive.ics.uci.edu/ml/datasets/Bag+of+Words)
Dataset Splits | No | "Each dataset was randomly partitioned into a training set with 80% of the data, and a test set with the 20% left." No separate validation set for hyperparameter tuning is mentioned.
Hardware Specification | No | "All experiments are conducted on the same desktop computer." No specific hardware details (CPU, GPU, or memory) are provided.
Software Dependencies | No | "PEM-HLTA is implemented in Java." No version numbers for Java or any other software dependencies are provided.
Experiment Setup | Yes | "In our experiments, we set κ = 50." "Based on the cut-off values for the Bayes factor, we set δ = 3 in our experiments."
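The 80/20 random partition noted under Dataset Splits can be sketched as follows. This is a minimal illustration only: the paper does not publish its splitting code, so the function name, seed, and use of Python here are assumptions.

```python
import random

def train_test_split(docs, train_frac=0.8, seed=0):
    """Randomly partition documents into a training set (80%) and a
    test set (20%), with no validation set, as described in the paper.
    This sketch is hypothetical; the authors' actual code is not public."""
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)  # seeded for repeatability
    cut = round(train_frac * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Example: 100 documents -> 80 for training, 20 for testing
train, test = train_test_split(range(100), train_frac=0.8, seed=42)
print(len(train), len(test))  # 80 20
```

Note that because no validation set is held out, hyperparameters such as κ and δ appear to be fixed a priori rather than tuned on held-out data.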