Dynamic User Profiling for Streams of Short Texts
Authors: Shangsong Liang
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results validate the effectiveness of the proposed algorithms. |
| Researcher Affiliation | Academia | Shangsong Liang, Department of Computer Science, University College London, United Kingdom (shangsong.liang@ucl.ac.uk) |
| Pseudocode | Yes | Algorithm 1: Overview of the proposed SPA algorithm; Algorithm 2: Inference for our UET model at time t; Algorithm 3: SKDA algorithm to generate top-k keywords for dynamically profiling each user's expertise. |
| Open Source Code | No | The paper mentions 'Embeddings of both regular words and all hashtags are publicly available from https://nlp.stanford.edu/projects/glove/', but this refers to a third-party resource (word embeddings) used by the authors, not their own source code for the methodology described in the paper. |
| Open Datasets | No | The paper states 'In order to answer our research questions, we work with a dataset collected from Twitter. The dataset contains 1,375 active users and their tweets that were posted from the beginning of their registration up to May 31, 2015.' and, in a footnote, 'Crawled from https://dev.twitter.com/'. This indicates how the data was obtained, but does not provide a direct link, formal citation, or clear statement that the specific collected dataset is publicly available for download. |
| Dataset Splits | Yes | For tuning parameters λ1 and λ2 in (8), we use a 70%/20%/10% split for our training, validation and test sets, respectively. |
| Hardware Specification | No | The paper does not provide specific details regarding the hardware (e.g., CPU, GPU models, memory) used for conducting the experiments. |
| Software Dependencies | No | The paper mentions various models and algorithms but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or specific library versions) that would be needed for replication. |
| Experiment Setup | Yes | 'For tuning parameters λ1 and λ2 in (8), we use a 70%/20%/10% split for our training, validation and test sets, respectively. In the training we vary the parameters λ1 and λ2 from 0.0 to 1.0. The best parameters are then chosen on the validation set, and evaluated on the test set. The train/validation/test splits are permuted until all users were chosen once for the test set. We repeat the experiments 10 times and report the average results.' The paper also states 'We set the number of topics Z = 50 in all the topic models.' (A hedged code sketch of this tuning protocol follows the table.) |
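The tuning protocol quoted in the Experiment Setup row is concrete enough to sketch in code. The following is a minimal illustration, not the authors' implementation: `fit_and_score` is a hypothetical callback standing in for training the (unreleased) UET model with a given λ1/λ2 pair and returning an evaluation score, and the phrase 'permuted until all users were chosen once for the test set' is read here as ten rotating 10% test folds, which is an assumption.

```python
import itertools
import random

def tune_lambdas(users, fit_and_score, n_folds=10, grid_step=0.1):
    """Sketch of the paper's lambda1/lambda2 tuning protocol.

    `fit_and_score(train, eval_users, lam1, lam2)` is a hypothetical
    stand-in for training the UET model and scoring it on `eval_users`.
    """
    users = list(users)
    random.shuffle(users)
    fold = len(users) // n_folds                  # 10 folds -> 10% test each round
    # Candidate values 0.0, 0.1, ..., 1.0 (the paper varies both lambdas in [0, 1]).
    grid = [round(i * grid_step, 2) for i in range(int(1 / grid_step) + 1)]
    test_scores = []
    for k in range(n_folds):                      # rotate the test fold so every
        test = users[k * fold:(k + 1) * fold]     # user is tested exactly once
        rest = users[:k * fold] + users[(k + 1) * fold:]
        n_valid = len(users) // 5                 # 20% validation, remainder trains
        valid, train = rest[:n_valid], rest[n_valid:]
        # Pick the (lambda1, lambda2) pair maximizing the validation score ...
        lam1, lam2 = max(itertools.product(grid, grid),
                         key=lambda p: fit_and_score(train, valid, *p))
        # ... then evaluate the chosen pair on the held-out test fold.
        test_scores.append(fit_and_score(train, test, lam1, lam2))
    return sum(test_scores) / len(test_scores)    # average over the 10 repetitions
```

Whether validation users are folded back into training before the final test evaluation is not specified in the paper; the sketch keeps them separate.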