Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Progressive EM for Latent Tree Models and Hierarchical Topic Detection
Authors: Peixian Chen, Nevin Zhang, Leonard Poon, Zhourong Chen
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical experiments show that our method greatly improves the efficiency of HLTA. It is as efficient as the state-of-the-art LDA-based method for hierarchical topic detection and finds substantially better topics and topic hierarchies. We tested the idea on the New York Times dataset, which consists of 300,000 articles. |
| Researcher Affiliation | Academia | The Hong Kong University of Science and Technology {pchenac,lzhang,EMAIL}; The Hong Kong Institute of Education {EMAIL} |
| Pseudocode | Yes | Algorithm 1 PEM-HLTA(D, τ, δ, κ) |
| Open Source Code | No | The paper provides a link for the HLTA algorithm (prior work) but does not state that the code for the method described in this paper (PEM-HLTA) is open-source or provide a link for it. |
| Open Datasets | Yes | Two of the datasets used are NIPS [1] and Newsgroup [2]. [1] http://www.cs.nyu.edu/~roweis/data.html [2] http://qwone.com/~jason/20Newsgroups/ We tested the idea on the New York Times dataset [4], which consists of 300,000 articles. [4] http://archive.ics.uci.edu/ml/datasets/Bag+of+Words |
| Dataset Splits | No | Each dataset was randomly partitioned into a training set with 80% of the data, and a test set with the 20% left. (No explicit mention of a separate validation set for hyperparameter tuning.) |
| Hardware Specification | No | All experiments are conducted on the same desktop computer. (No specific hardware details like CPU, GPU, or memory are provided.) |
| Software Dependencies | No | PEM-HLTA is implemented in Java. (No specific version numbers for Java or any other software dependencies are provided.) |
| Experiment Setup | Yes | In our experiments, we set κ = 50. Based on the cut-off values for the Bayes factor, we set δ = 3 in our experiments. |
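
The Dataset Splits row quotes a random 80%/20% train/test partition but reports no seed or procedure. As a minimal sketch of what a reproducible version of such a split could look like, the Python snippet below shuffles with a fixed seed and cuts at 80%. The function name `split_train_test`, the `seed` parameter, and the toy document list are illustrative assumptions; they are not taken from the paper, whose implementation is in Java.

```python
import random


def split_train_test(documents, train_fraction=0.8, seed=0):
    """Randomly partition documents into a training list and a test list.

    Hypothetical helper: the seed and the rounding rule are assumptions,
    since the paper only states an 80%/20% random partition.
    """
    rng = random.Random(seed)      # local generator, so the split is repeatable
    shuffled = list(documents)     # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    docs = [f"doc-{i}" for i in range(10)]   # toy stand-in for a corpus
    train, test = split_train_test(docs)
    print(len(train), len(test))
```

Recording the seed alongside the split fraction is the kind of detail that would let another group recreate the same training and test sets.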