A Probabilistic Model for Bursty Topic Discovery in Microblogs

Authors: Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on a standard Twitter dataset show that our approach outperforms the state-of-the-art baselines significantly."
Researcher Affiliation | Academia | Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng — Institute of Computing Technology, Chinese Academy of Sciences
Pseudocode | Yes | Algorithm 1: Gibbs sampling algorithm for BBTM
Open Source Code | No | The paper states that "all the codes are implemented in C++" but provides no link to, or explicit statement about, the public availability of the source code.
Open Datasets | Yes | The authors use a standard microblog dataset, the Tweets2011 collection published in the TREC 2011 microblog track (http://trec.nist.gov/data/tweets/). The dataset contains approximately 16 million tweets sampled over 17 days, from Jan. 23 to Feb. 8, 2011.
Dataset Splits | No | The paper describes the dataset used (Tweets2011) and sets the length of a time slice to one day, but it does not specify explicit train/validation/test splits by percentage or sample count for model training and evaluation.
Hardware Specification | Yes | "The experiments are conducted on a personal computer with two Dual-core 2.6GHz Intel processors and 4 GB of RAM, and all the codes are implemented in C++."
Software Dependencies | No | The paper only mentions that "all the codes are implemented in C++", without naming any software libraries or their version numbers.
Experiment Setup | Yes | Following the convention in BTM (Yan et al. 2013), the authors set α = 50/K and β = 0.01 in BBTM. The number of bursty topics K is varied from 10 to 50.
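The reported hyperparameter convention can be sketched as a small helper. This is an illustrative reconstruction, not code from the paper: the function name and the K grid step are assumptions; only α = 50/K, β = 0.01, and the K range of 10 to 50 come from the paper.

```python
# Hedged sketch of the BBTM hyperparameter convention reported in the paper
# (following BTM, Yan et al. 2013): alpha = 50/K, beta = 0.01.
# Function and variable names are illustrative, not from the authors' code.

def bbtm_hyperparameters(num_topics):
    """Return (alpha, beta) for a given number of bursty topics K."""
    alpha = 50.0 / num_topics  # symmetric Dirichlet prior on topic proportions
    beta = 0.01                # symmetric Dirichlet prior on topic-word distributions
    return alpha, beta

# The paper varies K from 10 to 50; a step of 10 is assumed here.
for k in range(10, 51, 10):
    alpha, beta = bbtm_hyperparameters(k)
    print(f"K={k}: alpha={alpha}, beta={beta}")
```

Note that α shrinks as K grows, so with more bursty topics each document's topic distribution is given a sparser prior.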