Identifying Search Keywords for Finding Relevant Social Media Posts

Authors: Shuai Wang, Zhiyuan Chen, Bing Liu, Sherry Emery

AAAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments are carried out on identifying such keywords for five (5) real-life application topics to be used for searching relevant tweets from the Twitter API. The results show that the proposed method is highly effective.
Researcher Affiliation | Academia | Department of Computer Science; Institute for Health Research and Policy; University of Illinois at Chicago, Chicago, Illinois, USA; {shuaiwanghk, czyuanacm}@gmail.com, liub@cs.uic.edu, slemery@uic.edu
Pseudocode | Yes | Algorithm 1: Initial Ranking; Algorithm 2: Re-Ranking
Open Source Code | No | The paper does not state that the source code for its proposed method is publicly available. It only references the Twitter API.
Open Datasets | No | The paper mentions collecting data from the Twitter API and using a 'random tweets set RT', but it does not provide concrete access information (link, DOI, or a specific citation with author and year) that would make these datasets publicly available.
Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits (e.g., percentages, sample counts) for a fixed dataset. It describes an iterative keyword discovery process.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using the 'Twitter-API' and algorithms such as LDA and PageRank, but it does not list specific versions of the programming languages, libraries, or software dependencies required to replicate the experiments.
Experiment Setup | Yes | We use 20000 random tweets as the reference set RT for entropy computation in initial ranking... the smoothing parameter λ in the entropy Equation 1 is set to 1. MIN Freq is set to 5. For efficiency and without being blocked by Twitter, only the top 100 words (SK) are picked up from the initial ranking (Step 3) and passed to re-ranking (Step 4)... In re-ranking, we set the number of returned tweets in T(w) from Twitter for each keyword w in SK to 300.
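
The values quoted in the Experiment Setup row can be gathered into a small configuration object. The sketch below is illustrative only and is not the authors' code: the class name, field names, and the `select_candidates` helper are hypothetical. It also assumes that MIN Freq is a minimum word-frequency threshold and that a higher score marks a better candidate. The paper's entropy-based scoring (Equation 1) is not reproduced here, so scores are supplied by the caller.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    reference_tweets: int = 20000   # size of the random reference set RT
    smoothing_lambda: float = 1.0   # smoothing parameter λ in the entropy equation
    min_freq: int = 5               # MIN Freq (assumed: minimum word frequency)
    top_k_initial: int = 100        # words (SK) passed from initial ranking to re-ranking
    tweets_per_keyword: int = 300   # tweets in T(w) requested per keyword w in SK


def select_candidates(
    scores: Dict[str, float],
    freqs: Dict[str, int],
    setup: ExperimentSetup,
) -> List[str]:
    """Drop words below the frequency threshold, then keep the top-k by score.

    Assumption: higher score = better candidate. The actual scoring used in
    the paper is not implemented; `scores` is whatever the caller provides.
    """
    eligible = [w for w in scores if freqs.get(w, 0) >= setup.min_freq]
    eligible.sort(key=lambda w: scores[w], reverse=True)
    return eligible[: setup.top_k_initial]


if __name__ == "__main__":
    # Toy inputs, not data from the paper.
    setup = ExperimentSetup()
    toy_scores = {"alpha": 0.9, "beta": 0.8, "gamma": 0.1}
    toy_freqs = {"alpha": 12, "beta": 3, "gamma": 500}
    print(select_candidates(toy_scores, toy_freqs, setup))  # ['alpha', 'gamma']
```

Keeping the parameters in one frozen object makes it easy to record exactly which settings a run used, which matters here because the paper reports no code or software versions.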