Lifetime Lexical Variation in Social Media
Authors: Lizi Liao, Jing Jiang, Ying Ding, Heyan Huang, Ee-Peng Lim
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical evaluation shows that our model can learn meaningful age-specific topics such as school for teenagers and health for older people. Our model can also be used for age prediction and performs better than a number of baseline methods. Experiments: This section presents the empirical evaluation of our model. |
| Researcher Affiliation | Academia | Lizi Liao, School of Computer Science, Beijing Institute of Technology, liaolizi.llz@gmail.com; Jing Jiang, School of Information System, Singapore Management University, jingjiang@smu.edu.sg; Ying Ding, School of Information System, Singapore Management University, ying.ding.2011@phdis.smu.edu.sg; Heyan Huang, School of Computer Science, Beijing Institute of Technology, hhy63@bit.edu.cn; Ee-Peng Lim, School of Information System, Singapore Management University, eplim@smu.edu.sg |
| Pseudocode | No | The paper describes the Gibbs-EM algorithm using mathematical formulas and textual explanations for the E-step and M-step, but it does not provide a structured pseudocode block or a clearly labeled "Algorithm" figure. |
| Open Source Code | No | The paper does not contain any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | No | Our experiments are based on Twitter. We used the following strategy to crawl Twitter users. Starting from a set of 59 popular seed users in Singapore, we first crawled these users' direct followers and followees and then crawled their followers'/followees' followers and followees... Finally, we got 16,017 users' tweets and age information. The paper describes collecting its own dataset from Twitter users but does not provide any specific access information (link, citation, or repository) for public availability. |
| Dataset Splits | No | In our age prediction experiments, we randomly selected 150 users from the 1564 users as our test data. For training, a set of users with their tweets and age information are used. Around 10% of the users are used for testing and the rest are used for training. The paper specifies a test set and a training set but does not explicitly mention a separate validation set or its specific split. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper refers to various models and algorithms like "standard LDA", "Gibbs-EM algorithm", and "Support Vector Regression (SVR)" implemented with "Liblinear", but it does not provide specific version numbers for any software or libraries used in the experiments. |
| Experiment Setup | Yes | In our experiments, we set α to 0.25 and β to 0.2. We empirically choose 200 topics. We run 32 iterations of Gibbs EM, where during each iteration in the E-step we run 400 iterations of Gibbs sampling. |
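
The reported setup and split can be collected into a small configuration sketch. The numeric values below are the ones quoted in the table; everything else is an assumption made for illustration: the names `ExperimentConfig` and `split_users`, the field names, the random seed, and the use of integer user IDs are hypothetical, and the paper does not state how the random selection was seeded or implemented.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    alpha: float = 0.25                     # "we set α to 0.25"
    beta: float = 0.2                       # "and β to 0.2"
    num_topics: int = 200                   # "We empirically choose 200 topics"
    em_iterations: int = 32                 # outer Gibbs-EM iterations
    gibbs_iterations_per_e_step: int = 400  # Gibbs sampling iterations in each E-step


def split_users(user_ids, num_test=150, seed=0):
    """Randomly hold out num_test users for testing; the rest are used for training.

    The seed is an assumption; the paper only says the test users were
    selected randomly.
    """
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    return shuffled[num_test:], shuffled[:num_test]


config = ExperimentConfig()

# Placeholder integer IDs stand in for the 1564 users mentioned in the Dataset Splits row.
train_ids, test_ids = split_users(range(1564))

# Total Gibbs sampling iterations implied by the reported schedule.
total_gibbs_iterations = config.em_iterations * config.gibbs_iterations_per_e_step

print(len(train_ids), len(test_ids), total_gibbs_iterations)
```

Holding out 150 of 1564 users is about 9.6% of the users, which agrees with the "around 10%" figure quoted in the Dataset Splits row.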