Location-Sensitive User Profiling Using Crowdsourced Labels

Authors: Wei Niu, James Caverlee, Haokai Lu

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Through extensive experiments over a Twitter list dataset, we demonstrate the effectiveness of this location-sensitive user profiling." |
| Researcher Affiliation | Academia | Wei Niu, James Caverlee, Haokai Lu; Department of Computer Science and Engineering, Texas A&M University, College Station, TX 77840, USA ({wei,caverlee,hlu}@cse.tamu.edu) |
| Pseudocode | Yes | Algorithm 1 (Mincost Tree Formation) and Algorithm 2 (Approximation Algorithm) are presented in the paper. |
| Open Source Code | No | The paper contains no explicit statement about releasing the source code for the methodology, and no link to a code repository is provided. |
| Open Datasets | No | The paper states: "We rely on a Twitter list dataset containing 15 million list relationships in which the geo-coordinates of the labelers and users are known (?)." The citation is ambiguous, and no link, DOI, or clear statement of public availability for the dataset is provided. |
| Dataset Splits | Yes | "The results reported for every profiling experiment in this paper, including baselines, are based on four-fold cross-validation and averaged over the nine locations. For each user, the seen tag set Pk(u) is a random 25% of his profile P(u). We then try to predict the tags in the remaining 75% unseen tags." |
| Hardware Specification | No | The paper does not mention any specific hardware (e.g., CPU or GPU model, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions "Rank SVM (?)" and a "language identification package (?)" but does not provide version numbers for any software dependency. |
| Experiment Setup | Yes | "For reproducibility, the number of negative samples, number of iterations, and number of user and tag latent factors are set as 200, 80, and 20, respectively. Regularization weights are set as 0.02. We apply text processing techniques such as case folding, stopword removal, and noun singularization. We also separate string patterns like 'Food Drink' into two words, 'food' and 'drink'. We use a language identification package (?) to filter out non-English tags. To guarantee the informativeness and quality of the tags, we filter out infrequent tags with fewer than 5 labelers and 10 labelees. A total of 13 features, including the features introduced above, are used for training the model." |
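The "Dataset Splits" row describes a per-user 25% seen / 75% unseen tag split. A minimal sketch of that per-user division, assuming nothing about the paper's actual code (the function and variable names here are hypothetical illustrations):

```python
import random

def split_user_profile(profile_tags, seen_fraction=0.25, seed=0):
    """Split one user's tag profile P(u) into a seen set Pk(u) (a random
    25% of the tags) and the unseen remainder (75%) to be predicted.
    Illustrative only; the paper's actual split code is not released."""
    rng = random.Random(seed)
    tags = list(profile_tags)
    rng.shuffle(tags)
    n_seen = max(1, int(len(tags) * seen_fraction))
    return set(tags[:n_seen]), set(tags[n_seen:])

seen, unseen = split_user_profile(["food", "drink", "travel", "music"])
```

In the paper's protocol this split would be repeated within four-fold cross-validation and the scores averaged over the nine locations.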
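The "Experiment Setup" row lists the tag-cleaning steps: case folding, stopword removal, splitting patterns like "Food Drink", and dropping tags with fewer than 5 labelers or 10 labelees. A hedged sketch of those steps (the stopword list, function names, and data layout are my own assumptions; the paper's noun singularization and language-identification steps are omitted here):

```python
import re

STOPWORDS = {"the", "a", "an", "of", "and"}  # illustrative subset only

def normalize_tag(raw_tag):
    """Case-fold a list label and split multi-word patterns like
    'Food Drink' into separate words, dropping stopwords."""
    words = re.findall(r"[a-z]+", raw_tag.lower())
    return [w for w in words if w not in STOPWORDS]

def filter_infrequent(tag_labelers, tag_labelees,
                      min_labelers=5, min_labelees=10):
    """Keep only tags with at least `min_labelers` labelers and
    `min_labelees` labelees, per the paper's frequency thresholds.
    Inputs are assumed to be tag -> count mappings."""
    return {t for t, n in tag_labelers.items()
            if n >= min_labelers
            and tag_labelees.get(t, 0) >= min_labelees}

print(normalize_tag("Food Drink"))  # ['food', 'drink']
```

The thresholds (5 labelers, 10 labelees) are the only values taken directly from the paper; everything else is a plausible reconstruction.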