Extracting Topical Phrases from Clinical Documents

Authors: Yulan He

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on patients discharge summaries show that the proposed approach outperforms the state-of-the-art topical phrase extraction model on both perplexity and topic coherence measure and finds more interpretable topics.
Researcher Affiliation | Academia | Yulan He, School of Engineering and Applied Science, Aston University, UK, y.he@cantab.net
Pseudocode | No | The paper includes a plate diagram (Figure 2) and describes the generative process textually, but no explicit pseudocode or algorithm block is provided.
Open Source Code | No | The paper mentions using an 'off-the-shelf tool called MedTagger' and provides its URL (http://www.ohnlp.org/index.php/MedTagger), but there is no explicit statement or link for the open-source code of the authors' proposed method (TPM).
Open Datasets | Yes | We use the clinical record data released as part of the i2b2 Natural Language Processing Challenges for Clinical Records (Uzuner et al. 2010).
Dataset Splits | No | The paper mentions using '10% of the data as a held-out set' for testing, but does not specify a separate validation set split or detailed split percentages/counts for all data partitions.
Hardware Specification | No | No specific hardware details (such as GPU or CPU models, or memory specifications) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper mentions software like MALLET and MedTagger, but does not provide specific version numbers for these or any other key software dependencies needed for replication.
Experiment Setup | Yes | We train TPM with a maximum of 1,000 Gibbs sampling iterations and stop if the total log-likelihood converges. We optimise all the hyperparameters including α and a_{n-1}, b_{n-1} for different context length n in HPYP every 50 iterations.
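
The training schedule quoted under Experiment Setup (at most 1,000 Gibbs sampling iterations, early stopping on log-likelihood convergence, hyperparameter optimisation every 50 iterations) can be sketched as a generic loop. This is only an illustration of the schedule, not the authors' implementation: the names `run_sampler`, `sweep`, and `optimise_hyperparameters` are hypothetical, and the convergence test (relative change between consecutive log-likelihoods below `rel_tol`) is an assumption, since the paper does not state its criterion.

```python
from typing import Callable, List


def run_sampler(
    sweep: Callable[[], float],
    optimise_hyperparameters: Callable[[], None],
    max_iters: int = 1000,      # "maximum of 1,000 Gibbs sampling iterations"
    optimise_every: int = 50,   # "every 50 iterations"
    rel_tol: float = 1e-4,      # assumed tolerance; not given in the paper
) -> List[float]:
    """Run Gibbs sweeps until the log-likelihood converges or max_iters is hit.

    `sweep` performs one full sampling pass and returns the total
    log-likelihood; `optimise_hyperparameters` updates the model's
    hyperparameters in place. Both are hypothetical callbacks.
    """
    history: List[float] = []
    for it in range(1, max_iters + 1):
        ll = sweep()
        history.append(ll)

        # Periodic hyperparameter optimisation.
        if it % optimise_every == 0:
            optimise_hyperparameters()

        # Assumed stopping rule: small relative change between consecutive
        # log-likelihood values.
        if len(history) >= 2:
            prev = history[-2]
            if abs(ll - prev) <= rel_tol * abs(prev):
                break
    return history
```

Because the log-likelihood of a Gibbs sampler fluctuates from sweep to sweep, a single-step comparison like this one can stop too early; a windowed average would be a more cautious choice, and either variant is a guess at what the paper means by "converges".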