A Probabilistic Model for Bursty Topic Discovery in Microblogs
Authors: Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a standard Twitter dataset show that our approach outperforms the state-of-the-art baselines significantly. |
| Researcher Affiliation | Academia | Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng Institute of Computing Technology, Chinese Academy of Science |
| Pseudocode | Yes | Algorithm 1: Gibbs sampling algorithm for BBTM |
| Open Source Code | No | The paper states 'all the codes are implemented in C++' but does not provide any link or explicit statement about the public availability of their source code. |
| Open Datasets | Yes | We use a standard microblog dataset, i.e., the Tweets2011 collection published in TREC 2011 microblog track [2]. The dataset contains approximately 16 million tweets sampled in 17 days from Jan. 23 to Feb. 8, 2011. ... [2] http://trec.nist.gov/data/tweets/ |
| Dataset Splits | No | The paper names the dataset (Tweets2011) and sets the length of a time slice to one day, but it does not specify train, validation, or test splits, either as percentages or as sample counts. |
| Hardware Specification | Yes | The experiments are conducted on a personal computer with two Dual-core 2.6GHz Intel processors and 4 GB of RAM, and all the codes are implemented in C++. |
| Software Dependencies | No | The paper only mentions that 'all the codes are implemented in C++', without specifying any software libraries or their version numbers. |
| Experiment Setup | Yes | Following the convention in BTM (Yan et al. 2013), we set α = 50/K and β = 0.01 in BBTM. The number of bursty topics K are varied from 10 to 50. |
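
The experiment setup row ties the Dirichlet prior α to the number of bursty topics K, so each value of K implies a different α while β stays fixed. The sketch below tabulates those settings. The function name `bbtm_hyperparameters` and the step of 10 between K values are assumptions made for illustration; the quoted text only gives the range 10 to 50, and this is not the authors' C++ code.

```python
def bbtm_hyperparameters(num_topics: int) -> dict:
    """Return the priors quoted in the setup row for a given K.

    alpha = 50 / K and beta = 0.01 come from the quoted text;
    the function itself is a hypothetical helper.
    """
    if num_topics <= 0:
        raise ValueError("num_topics must be positive")
    return {"K": num_topics, "alpha": 50.0 / num_topics, "beta": 0.01}


# Assumption: K is swept in steps of 10. The source states only "from 10 to 50".
TOPIC_COUNTS = range(10, 51, 10)

grid = [bbtm_hyperparameters(k) for k in TOPIC_COUNTS]

for setting in grid:
    print(setting)
```

Under these assumptions, α shrinks as K grows, which gives each topic a weaker prior weight when more topics are available.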