Online Bayesian Models for Personal Analytics in Social Media
Authors: Svitlana Volkova, Benjamin Van Durme
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We design a set of classification experiments from three types of data streams including user (U), neighbor (N) and user-neighbor (UN). For all settings we perform 6-fold cross validation and use a balanced prior: 50 users in the train split and 250 users in the test. |
| Researcher Affiliation | Academia | Svitlana Volkova and Benjamin Van Durme, Center for Language and Speech Processing, Johns Hopkins University, Baltimore MD 21218, USA; svitlana@jhu.edu, vandurme@cs.jhu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper refers to using the 'LIBLINEAR package integrated in the JERBOA toolkit', both of which are third-party tools, but does not provide access to the authors' own source code for the methodology described. |
| Open Datasets | Yes | We rely on a dataset previously used for political affiliation classification by (Pennacchiotti and Popescu 2011), then (Zamal, Liu, and Ruths 2012) and (Volkova, Coppersmith, and Van Durme 2014). The original Twitter users with their political labels extracted from http://www.wefollow.com as described by (Pennacchiotti and Popescu 2011). |
| Dataset Splits | Yes | For all settings we perform 6-fold cross validation and use a balanced prior: 50 users in the train split and 250 users in the test. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'LIBLINEAR package integrated in the JERBOA toolkit' but does not specify version numbers for these software components. |
| Experiment Setup | Yes | We design a set of classification experiments from three types of data streams including user (U), neighbor (N) and user-neighbor (UN). For all settings we perform 6-fold cross validation and use a balanced prior: 50 users in the train split and 250 users in the test. The log-linear models with dynamic Bayesian updates defined in Eq.1 and Eq.3 are learned using binary word unigram features extracted from user or neighbor content. |
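
The split described in the Experiment Setup row (6 folds, 50 training users and 250 test users) can be read as an inverted cross-validation in which one fold trains and the remaining five test. The following is a minimal sketch of that reading, not the authors' code: the total of 300 users, the function names `binary_unigrams` and `inverted_six_fold`, the whitespace tokenizer, and the user IDs are all assumptions made for illustration, and class balancing within each fold is omitted.

```python
import random


def binary_unigrams(text):
    """Return the set of lowercased whitespace tokens in `text`.

    A set records only presence or absence of each word, which is one
    simple way to represent binary word unigram features.
    """
    return set(text.lower().split())


def inverted_six_fold(user_ids, n_folds=6, seed=0):
    """Yield (train, test) pairs: one fold trains, the other folds test.

    Assumption: this inverted layout is inferred from the 50/250 sizes
    quoted in the table; it is not confirmed by the source.
    """
    ids = list(user_ids)
    random.Random(seed).shuffle(ids)
    # Striding assigns every n_folds-th user to the same fold.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        train = folds[k]
        test = [u for j, fold in enumerate(folds) if j != k for u in fold]
        yield train, test


# Hypothetical user IDs; 300 users gives 50 per fold.
users = [f"user{i}" for i in range(300)]
splits = list(inverted_six_fold(users))
```

Each element of `splits` is a `(train, test)` pair of user-ID lists. A stratified variant would shuffle and stride each class separately to keep the balanced prior mentioned in the table.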