A Probabilistic Model for Bursty Topic Discovery in Microblogs

Authors: Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on a standard Twitter dataset show that our approach outperforms the state-of-the-art baselines significantly."
Researcher Affiliation | Academia | Xiaohui Yan, Jiafeng Guo, Yanyan Lan, Jun Xu, Xueqi Cheng — Institute of Computing Technology, Chinese Academy of Sciences
Pseudocode | Yes | Algorithm 1: Gibbs sampling algorithm for BBTM
Open Source Code | No | The paper states that "all the codes are implemented in C++" but provides no link to, or explicit statement about, the public availability of the source code.
Open Datasets | Yes | The authors use a standard microblog dataset, the Tweets2011 collection published in the TREC 2011 microblog track (http://trec.nist.gov/data/tweets/). The dataset contains approximately 16 million tweets sampled over 17 days, from Jan. 23 to Feb. 8, 2011.
Dataset Splits | No | The paper describes the dataset used (Tweets2011) and sets the length of a time slice to one day, but it does not specify explicit train/validation/test splits by percentage or sample count for model training and evaluation.
Hardware Specification | Yes | "The experiments are conducted on a personal computer with two Dual-core 2.6GHz Intel processors and 4 GB of RAM, and all the codes are implemented in C++."
Software Dependencies | No | The paper only mentions that "all the codes are implemented in C++", without naming any software libraries or their version numbers.
Experiment Setup | Yes | Following the convention in BTM (Yan et al. 2013), the authors set α = 50/K and β = 0.01 in BBTM. The number of bursty topics K is varied from 10 to 50.
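The reported hyperparameter convention can be sketched as a small helper. This is an illustrative reconstruction, not code from the paper: the function name and the K grid step are assumptions; only α = 50/K, β = 0.01, and the K range of 10 to 50 come from the paper.

```python
# Hedged sketch of the BBTM hyperparameter convention reported in the paper
# (following BTM, Yan et al. 2013): alpha = 50/K, beta = 0.01.
# Function and variable names are illustrative, not from the authors' code.

def bbtm_hyperparameters(num_topics):
    """Return (alpha, beta) for a given number of bursty topics K."""
    alpha = 50.0 / num_topics  # symmetric Dirichlet prior on topic proportions
    beta = 0.01                # symmetric Dirichlet prior on topic-word distributions
    return alpha, beta

# The paper varies K from 10 to 50; a step of 10 is assumed here.
for k in range(10, 51, 10):
    alpha, beta = bbtm_hyperparameters(k)
    print(f"K={k}: alpha={alpha}, beta={beta}")
```

Note that α shrinks as K grows, so with more bursty topics each document's topic distribution is given a sparser prior.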