reproducibilityindex.ai

Lifetime Lexical Variation in Social Media

Authors: Lizi Liao, Jing Jiang, Ying Ding, Heyan Huang, Ee-Peng Lim

AAAI 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirical evaluation shows that our model can learn meaningful age-specific topics such as school for teenagers and health for older people. Our model can also be used for age prediction and performs better than a number of baseline methods. Experiments: This section presents the empirical evaluation of our model.
Researcher Affiliation	Academia	Lizi Liao, School of Computer Science, Beijing Institute of Technology, liaolizi.llz@gmail.com; Jing Jiang, School of Information System, Singapore Management University, jingjiang@smu.edu.sg; Ying Ding, School of Information System, Singapore Management University, ying.ding.2011@phdis.smu.edu.sg; Heyan Huang, School of Computer Science, Beijing Institute of Technology, hhy63@bit.edu.cn; Ee-Peng Lim, School of Information System, Singapore Management University, eplim@smu.edu.sg
Pseudocode	No	The paper describes the Gibbs-EM algorithm using mathematical formulas and textual explanations for the E-step and M-step, but it does not provide a structured pseudocode block or a clearly labeled "Algorithm" figure.
Open Source Code	No	The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	No	Our experiments are based on Twitter. We used the following strategy to crawl Twitter users. Starting from a set of 59 popular seed users in Singapore, we first crawled these users' direct followers and followees and then crawled their followers/followees' followers and followees... Finally, we got 16,017 users' tweets and age information. The paper describes collecting its own dataset from Twitter users but does not provide any specific access information (link, citation, or repository) for public availability.
Dataset Splits	No	In our age prediction experiments, we randomly selected 150 users from the 1564 users as our test data. For training, a set of users with their tweets and age information are used. ... around 10% of the users are used for testing and the rest are used for training. The paper specifies a test set and a training set but does not explicitly mention a separate validation set or its specific split.
Hardware Specification	No	The paper does not provide any specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments.
Software Dependencies	No	The paper refers to various models and algorithms like "standard LDA", "Gibbs-EM algorithm", and "Support Vector Regression (SVR)" implemented with "Liblinear", but it does not provide specific version numbers for any software or libraries used in the experiments.
Experiment Setup	Yes	In our experiments, we set α to 0.25 and β to 0.2. We empirically choose 200 topics. We run 32 iterations of Gibbs EM, where during each iteration in the E-step we run 400 iterations of Gibbs sampling.
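
The reported setup and test split can be captured in a small configuration sketch, which may help anyone attempting a rerun. This is only an illustration, not the authors' code: the names `ExperimentConfig` and `split_users` and the random seed are hypothetical, and the user IDs are placeholders since the dataset is not public. Only the numeric values come from the rows above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    # Values quoted in the Experiment Setup row; field names are hypothetical.
    alpha: float = 0.25          # hyperparameter α as reported
    beta: float = 0.2            # hyperparameter β as reported
    num_topics: int = 200        # "We empirically choose 200 topics"
    em_iterations: int = 32      # outer Gibbs-EM iterations
    gibbs_iterations: int = 400  # Gibbs sampling iterations per E-step


def split_users(user_ids, num_test=150, seed=0):
    """Randomly hold out num_test users for testing.

    The seed is an assumption; the paper does not report one.
    Returns (train_users, test_users).
    """
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    return shuffled[num_test:], shuffled[:num_test]


config = ExperimentConfig()
# Placeholder IDs standing in for the 1564 users mentioned in the Dataset Splits row.
train_users, test_users = split_users(range(1564))
# Total Gibbs sampling iterations implied by the reported schedule.
total_gibbs_iterations = config.em_iterations * config.gibbs_iterations
```

Under this reading, the schedule implies 32 × 400 Gibbs sampling iterations in total, and holding out 150 of 1564 users is roughly the 10% test share quoted above.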