reproducibilityindex.ai

Extracting Topical Phrases from Clinical Documents

Authors: Yulan He

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental results on patients' discharge summaries show that the proposed approach outperforms the state-of-the-art topical phrase extraction model on both perplexity and topic coherence measure and finds more interpretable topics.
Researcher Affiliation	Academia	Yulan He, School of Engineering and Applied Science, Aston University, UK, y.he@cantab.net
Pseudocode	No	The paper includes a plate diagram (Figure 2) and describes the generative process textually, but no explicit pseudocode or algorithm block is provided.
Open Source Code	No	The paper mentions using an 'off-the-shelf tool called MedTagger' and provides its URL (http://www.ohnlp.org/index.php/MedTagger), but there is no explicit statement or link for the open-source code of the authors' proposed method (TPM).
Open Datasets	Yes	We use the clinical record data released as part of the i2b2 Natural Language Processing Challenges for Clinical Records (Uzuner et al. 2010).
Dataset Splits	No	The paper mentions using '10% of the data as a held-out set' for testing, but does not specify a separate validation set split or detailed split percentages/counts for all data partitions.
Hardware Specification	No	No specific hardware details (such as GPU or CPU models, or memory specifications) used for running the experiments are mentioned in the paper.
Software Dependencies	No	The paper mentions software like MALLET and MedTagger, but does not provide specific version numbers for these or any other key software dependencies needed for replication.
Experiment Setup	Yes	We train TPM with a maximum of 1,000 Gibbs sampling iterations and stop if the total log-likelihood converges. We optimise all the hyperparameters including α and a_{n-1}, b_{n-1} for different context length n in HPYP every 50 iterations.
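
The held-out split and training schedule reported in the table can be sketched as a small control loop. This is only an illustration under stated assumptions, not the authors' code, which is not released. The names `split_held_out`, `run_sampler`, `sweep`, `log_likelihood` and `optimise_hyperparameters` are hypothetical. The paper does not say how convergence is tested, so the relative-change threshold `tol` is an assumption, as are the shuffling and the random seed used for the 10% held-out set.

```python
import random
from typing import Callable, List, Sequence, Tuple


def split_held_out(docs: Sequence[str], held_out_fraction: float = 0.1,
                   seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle the documents and reserve a fraction as the held-out set.

    The 10% fraction comes from the table; shuffling and the seed are assumptions.
    """
    shuffled = list(docs)
    random.Random(seed).shuffle(shuffled)
    n_held_out = int(round(len(shuffled) * held_out_fraction))
    return shuffled[n_held_out:], shuffled[:n_held_out]


def run_sampler(sweep: Callable[[], None],
                log_likelihood: Callable[[], float],
                optimise_hyperparameters: Callable[[], None],
                max_iters: int = 1000,
                optimise_every: int = 50,
                tol: float = 1e-4) -> List[float]:
    """Run up to max_iters Gibbs sweeps and return the log-likelihood trace.

    sweep, log_likelihood and optimise_hyperparameters are placeholders for
    model-specific code. Stopping when the relative change in log-likelihood
    is at most tol is an assumed convergence criterion.
    """
    history: List[float] = []
    for iteration in range(1, max_iters + 1):
        sweep()  # one full pass of Gibbs sampling over the corpus
        if iteration % optimise_every == 0:
            optimise_hyperparameters()  # e.g. every 50 iterations
        current = log_likelihood()
        converged = bool(history) and (
            abs(current - history[-1]) <= tol * abs(history[-1])
        )
        history.append(current)
        if converged:
            break
    return history
```

A caller would pass closures over its own sampler state, for example `run_sampler(model.sweep, model.log_likelihood, model.optimise)` for some hypothetical `model` object, and inspect the length of the returned trace to see whether the run stopped early.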