Spatiotemporal Activity Modeling Under Data Scarcity: A Graph-Regularized Cross-Modal Embedding Approach

Authors: Chao Zhang, Mengxiong Liu, Zhengchao Liu, Carl Yang, Luming Zhang, Jiawei Han

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have empirically evaluated the performance of BRANCHNET, and found that it is capable of effectively transferring knowledge from external sources to learn better spatiotemporal activity models and outperforming strong baseline methods. (...) We have conducted extensive experiments on a number of real-life datasets. Our experimental results show that, compared with state-of-the-art methods, BRANCHNET can better transfer knowledge from relevant external sources and achieves better performance for spatiotemporal activity predictions.
Researcher Affiliation | Collaboration | 1 University of Illinois at Urbana-Champaign, Urbana, IL, USA; 2 Emo Kit Tech Co., Ltd., Beijing, China
Pseudocode | No | The paper describes the model and optimization process using equations and prose but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states, 'We implemented our model and other baselines with TensorFlow', providing a general link to the TensorFlow website (https://www.tensorflow.org/). However, it does not state that the authors' own implementation of the described methodology is open source, nor does it provide a direct link to their code repository.
Open Datasets | Yes | We use the geo-tagged social media data from (Zhang et al. 2017b) as target user behavioral data. (...) WordNet: WordNet is a lexical database of English, which groups English words (nouns, verbs, adjectives, etc.) into sets of synonyms. Given a user behavior corpus C, we extract the keywords in C that appear in the WordNet database, and construct an unweighted graph for those keywords. There exists an edge between two keywords if they are synonyms in WordNet. (Miller 1995)
Dataset Splits | Yes | Given a corpus C (i.e., LA or NY), we randomly split C into two different subsets: 80% for model training, and 20% for test.
Hardware Specification | Yes | We implemented our model and other baselines with TensorFlow, and conducted the experiments on a machine with an Intel Xeon 2.80GHz CPU using 20 threads.
Software Dependencies | No | The paper mentions using 'Tensorflow' but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | In our experiments, we set the main embedding dimension to 400 for all the methods by default, and set the task-specific embedding dimensions to 100. When using Adam to learn the embeddings, we set the learning rate to 0.002 and train for 10 epochs. The methods SEMIEMBED, PLANETOID, and BRANCHNET require transferring knowledge from external sources, and we set the default weight for an auxiliary task λn to 0.1.
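The WordNet synonym-graph construction quoted under Open Datasets can be sketched as follows. This is an illustrative reconstruction, not the authors' code: `toy_wordnet` is a tiny hand-made word-to-synset mapping standing in for the real WordNet lexicon (in practice one would query a WordNet interface such as NLTK's corpus reader), and `build_synonym_graph` is a hypothetical helper name.

```python
from itertools import combinations

# Stand-in for WordNet: maps each word to the synsets it belongs to.
# This toy mapping is illustrative only; real code would query WordNet.
toy_wordnet = {
    "car": {"car.n.01"},
    "auto": {"car.n.01"},
    "machine": {"car.n.01", "machine.n.01"},
    "coffee": {"coffee.n.01"},
}

def build_synonym_graph(corpus_keywords, synsets_of):
    """Unweighted graph over the corpus keywords found in WordNet;
    an edge joins two keywords that share a synset (i.e., are synonyms)."""
    # Keep only keywords that appear in the (toy) WordNet database.
    kept = [w for w in corpus_keywords if synsets_of.get(w)]
    graph = {w: set() for w in kept}
    # Connect every pair of retained keywords sharing at least one synset.
    for u, v in combinations(kept, 2):
        if synsets_of[u] & synsets_of[v]:
            graph[u].add(v)
            graph[v].add(u)
    return graph

g = build_synonym_graph(["car", "auto", "coffee", "tea"], toy_wordnet)
# "car" and "auto" share a synset; "tea" is absent from the toy lexicon.
```

Note the graph is unweighted and undirected, matching the paper's description of edges between synonymous keywords.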
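The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows can be sketched in plain Python. This is a hedged sketch, not the released implementation: `train_test_split`, the constant names, and the fixed seed are all assumptions for illustration; only the numeric values (80/20 split, embedding dimensions 400 and 100, Adam learning rate 0.002, 10 epochs, auxiliary-task weight 0.1) come from the paper.

```python
import random

def train_test_split(corpus, train_frac=0.8, seed=42):
    """Randomly split corpus records into train/test subsets
    (80%/20% by default, as reported in the paper). The seed is
    arbitrary and only makes this sketch deterministic."""
    records = list(corpus)
    random.Random(seed).shuffle(records)
    cut = int(len(records) * train_frac)
    return records[:cut], records[cut:]

# Hyperparameters as reported in the Experiment Setup row.
MAIN_EMBED_DIM = 400    # main embedding dimension (all methods)
TASK_EMBED_DIM = 100    # task-specific embedding dimension
LEARNING_RATE = 0.002   # Adam learning rate
EPOCHS = 10             # training epochs
AUX_TASK_WEIGHT = 0.1   # default weight for an auxiliary task (λn)

train, test = train_test_split(range(100))
# 80 training records, 20 test records.
```

In a TensorFlow implementation these constants would parameterize the optimizer and embedding layers; they are collected here only to make the reported configuration concrete.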