Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Extracting Topical Phrases from Clinical Documents
Authors: Yulan He
AAAI 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on patients discharge summaries show that the proposed approach outperforms the state-of-the-art topical phrase extraction model on both perplexity and topic coherence measure and finds more interpretable topics. |
| Researcher Affiliation | Academia | Yulan He School of Engineering and Applied Science Aston University, UK EMAIL |
| Pseudocode | No | The paper includes a plate diagram (Figure 2) and describes the generative process textually, but no explicit pseudocode or algorithm block is provided. |
| Open Source Code | No | The paper mentions using an 'off-the-shelf tool called MedTagger' and provides its URL (http://www.ohnlp.org/index.php/MedTagger), but there is no explicit statement or link for the open-source code of the authors' proposed method (TPM). |
| Open Datasets | Yes | We use the clinical record data released as part of the i2b2 Natural Language Processing Challenges for Clinical Records (Uzuner et al. 2010). |
| Dataset Splits | No | The paper mentions using '10% of the data as a held-out set' for testing, but does not specify a separate validation set split or detailed split percentages/counts for all data partitions. |
| Hardware Specification | No | No specific hardware details (such as GPU or CPU models, or memory specifications) used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions software like MALLET and MedTagger, but does not provide specific version numbers for these or any other key software dependencies needed for replication. |
| Experiment Setup | Yes | We train TPM with a maximum of 1,000 Gibbs sampling iterations and stop if the total log-likelihood converges. We optimise all the hyperparameters including α and a_{n-1}, b_{n-1} for different context length n in HPYP every 50 iterations. |
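
The training schedule quoted under Experiment Setup (at most 1,000 Gibbs sampling iterations, early stopping once the total log-likelihood converges, hyperparameter optimisation every 50 iterations) can be sketched as a small driver loop. This is a minimal illustration and not the authors' implementation: `run_sampler`, `sweep`, `optimise_hyperparameters`, and the relative tolerance `tol` are hypothetical names, and the convergence test is an assumption, since the paper as quoted here does not state its criterion.

```python
from typing import Callable, Optional


def run_sampler(
    sweep: Callable[[], float],
    optimise_hyperparameters: Callable[[], None],
    max_iters: int = 1000,      # quoted: maximum of 1,000 Gibbs iterations
    optimise_every: int = 50,   # quoted: optimise hyperparameters every 50 iterations
    tol: float = 1e-4,          # assumption: relative-change threshold, not from the paper
) -> int:
    """Run Gibbs sweeps until the log-likelihood converges or max_iters is hit.

    `sweep` performs one full Gibbs sampling pass and returns the total
    log-likelihood; `optimise_hyperparameters` updates the model's
    hyperparameters in place. Returns the number of sweeps performed.
    """
    prev_ll: Optional[float] = None
    for it in range(1, max_iters + 1):
        ll = sweep()
        if it % optimise_every == 0:
            optimise_hyperparameters()
        # Assumed convergence rule: relative change in log-likelihood below tol.
        if prev_ll is not None and abs(ll - prev_ll) <= tol * abs(prev_ll):
            return it
        prev_ll = ll
    return max_iters
```

Passing the sampler and the optimiser as callables keeps the schedule separate from the model, so the same loop could drive any Gibbs-sampled topic model for which a total log-likelihood is available.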