Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
A Probabilistic Model for Bursty Topic Discovery in Microblogs
Authors: Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a standard Twitter dataset show that our approach outperforms the state-of-the-art baselines significantly. |
| Researcher Affiliation | Academia | Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng Institute of Computing Technology, Chinese Academy of Science |
| Pseudocode | Yes | Algorithm 1: Gibbs sampling algorithm for BBTM |
| Open Source Code | No | The paper states 'all the codes are implemented in C++' but does not provide any link or explicit statement about the public availability of their source code. |
| Open Datasets | Yes | We use a standard microblog dataset, i.e., the Tweets2011 collection published in the TREC 2011 microblog track (http://trec.nist.gov/data/tweets/). The dataset contains approximately 16 million tweets sampled over 17 days from Jan. 23 to Feb. 8, 2011. |
| Dataset Splits | No | The paper describes the dataset used (Tweets2011) and that the length of a time slice is set to a day, but it does not specify explicit train, validation, or test dataset splits in terms of percentages or sample counts for model training/evaluation. |
| Hardware Specification | Yes | The experiments are conducted on a personal computer with two Dual-core 2.6GHz Intel processors and 4 GB of RAM, and all the codes are implemented in C++. |
| Software Dependencies | No | The paper only mentions that 'all the codes are implemented in C++', without specifying any software libraries or their version numbers. |
| Experiment Setup | Yes | Following the convention in BTM (Yan et al. 2013), we set α = 50/K and β = 0.01 in BBTM. The number of bursty topics K is varied from 10 to 50. |
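The hyperparameter convention quoted above (α = 50/K, β = 0.01) and the Gibbs sampling algorithm noted in the Pseudocode row can be illustrated with a minimal sketch. The sampler below is an assumption modeled on collapsed Gibbs sampling over biterms (unordered word pairs), in the style of BTM; it is not the authors' BBTM implementation, and the function names are hypothetical.

```python
import random

def bbtm_hyperparams(K):
    """Return (alpha, beta) following the paper's convention: alpha = 50/K, beta = 0.01."""
    return 50.0 / K, 0.01

def gibbs_sample_biterms(biterms, V, K, alpha, beta, iters=50, seed=0):
    """Hypothetical simplified collapsed Gibbs sampler over biterms.

    biterms: list of (w1, w2) word-id pairs; V: vocabulary size; K: topic count.
    Returns n_z, the number of biterms assigned to each topic.
    """
    rng = random.Random(seed)
    z = [rng.randrange(K) for _ in biterms]      # topic assignment per biterm
    n_z = [0] * K                                # biterms per topic
    n_wz = [[0] * V for _ in range(K)]           # word counts per topic
    for (w1, w2), k in zip(biterms, z):
        n_z[k] += 1
        n_wz[k][w1] += 1
        n_wz[k][w2] += 1
    for _ in range(iters):
        for i, (w1, w2) in enumerate(biterms):
            k = z[i]                             # remove current assignment
            n_z[k] -= 1
            n_wz[k][w1] -= 1
            n_wz[k][w2] -= 1
            # conditional p(z_i = t | rest), up to a normalizing constant
            weights = []
            for t in range(K):
                denom = 2 * n_z[t] + V * beta
                weights.append((n_z[t] + alpha)
                               * (n_wz[t][w1] + beta) / denom
                               * (n_wz[t][w2] + beta) / (denom + 1))
            k = rng.choices(range(K), weights=weights)[0]
            z[i] = k                             # record new assignment
            n_z[k] += 1
            n_wz[k][w1] += 1
            n_wz[k][w2] += 1
    return n_z
```

For example, with K = 10 the convention gives α = 5.0 and β = 0.01; the paper varies K from 10 to 50 under the same rule.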