Spatiotemporal Activity Modeling Under Data Scarcity: A Graph-Regularized Cross-Modal Embedding Approach

Authors: Chao Zhang, Mengxiong Liu, Zhengchao Liu, Carl Yang, Luming Zhang, Jiawei Han

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have empirically evaluated the performance of BRANCHNET, and found that it is capable of effectively transferring knowledge from external sources to learn better spatiotemporal activity models and outperforming strong baseline methods. (...) We have conducted extensive experiments on a number of real-life datasets. Our experimental results show that, compared with state-of-the-art methods, BRANCHNET can better transfer knowledge from relevant external sources and achieves better performance for spatiotemporal activity predictions.
Researcher Affiliation | Collaboration | 1 University of Illinois at Urbana-Champaign, Urbana, IL, USA; 2 Emo Kit Tech Co., Ltd., Beijing, China
Pseudocode | No | The paper describes the model and optimization process using equations and prose but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states, 'We implemented our model and other baselines with TensorFlow', providing a general link to the TensorFlow website (https://www.tensorflow.org/). However, it does not state that the authors' own implementation of the described methodology is open source, nor does it provide a direct link to their code repository.
Open Datasets | Yes | We use the geo-tagged social media data from (Zhang et al. 2017b) as target user behavioral data. (...) WordNet: WordNet is a lexical database of English, which groups English words (nouns, verbs, adjectives, etc.) into sets of synonyms. Given a user behavior corpus C, we extract the keywords in C that appear in the WordNet database, and construct an unweighted graph for those keywords. There exists an edge between two keywords if they are synonyms in WordNet. (Miller 1995)
Dataset Splits | Yes | Given a corpus C (i.e., LA or NY), we randomly split C into two different subsets: 80% for model training, and 20% for test.
Hardware Specification | Yes | We implemented our model and other baselines with TensorFlow, and conducted the experiments on a machine with an Intel Xeon 2.80GHz CPU using 20 threads.
Software Dependencies | No | The paper mentions using 'Tensorflow' but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | Yes | In our experiments, we set the main embedding dimension to 400 for all the methods by default, and set the task-specific embedding dimensions to 100. When using Adam to learn the embeddings, we set the learning rate to 0.002 and train for 10 epochs. The methods SEMIEMBED, PLANETOID, and BRANCHNET require transferring knowledge from external sources, and we set the default weight for an auxiliary task λn to 0.1.
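The WordNet synonym-graph construction quoted under Open Datasets can be sketched as follows. This is an illustrative reconstruction, not the authors' code: `toy_wordnet` is a tiny hand-made word-to-synset mapping standing in for the real WordNet lexicon (in practice one would query a WordNet interface such as NLTK's corpus reader), and `build_synonym_graph` is a hypothetical helper name.

```python
from itertools import combinations

# Stand-in for WordNet: maps each word to the synsets it belongs to.
# This toy mapping is illustrative only; real code would query WordNet.
toy_wordnet = {
    "car": {"car.n.01"},
    "auto": {"car.n.01"},
    "machine": {"car.n.01", "machine.n.01"},
    "coffee": {"coffee.n.01"},
}

def build_synonym_graph(corpus_keywords, synsets_of):
    """Unweighted graph over the corpus keywords found in WordNet;
    an edge joins two keywords that share a synset (i.e., are synonyms)."""
    # Keep only keywords that appear in the (toy) WordNet database.
    kept = [w for w in corpus_keywords if synsets_of.get(w)]
    graph = {w: set() for w in kept}
    # Connect every pair of retained keywords sharing at least one synset.
    for u, v in combinations(kept, 2):
        if synsets_of[u] & synsets_of[v]:
            graph[u].add(v)
            graph[v].add(u)
    return graph

g = build_synonym_graph(["car", "auto", "coffee", "tea"], toy_wordnet)
# "car" and "auto" share a synset; "tea" is absent from the toy lexicon.
```

Note the graph is unweighted and undirected, matching the paper's description of edges between synonymous keywords.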
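The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows can be sketched in plain Python. This is a hedged sketch, not the released implementation: `train_test_split`, the constant names, and the fixed seed are all assumptions for illustration; only the numeric values (80/20 split, embedding dimensions 400 and 100, Adam learning rate 0.002, 10 epochs, auxiliary-task weight 0.1) come from the paper.

```python
import random

def train_test_split(corpus, train_frac=0.8, seed=42):
    """Randomly split corpus records into train/test subsets
    (80%/20% by default, as reported in the paper). The seed is
    arbitrary and only makes this sketch deterministic."""
    records = list(corpus)
    random.Random(seed).shuffle(records)
    cut = int(len(records) * train_frac)
    return records[:cut], records[cut:]

# Hyperparameters as reported in the Experiment Setup row.
MAIN_EMBED_DIM = 400    # main embedding dimension (all methods)
TASK_EMBED_DIM = 100    # task-specific embedding dimension
LEARNING_RATE = 0.002   # Adam learning rate
EPOCHS = 10             # training epochs
AUX_TASK_WEIGHT = 0.1   # default weight for an auxiliary task (λn)

train, test = train_test_split(range(100))
# 80 training records, 20 test records.
```

In a TensorFlow implementation these constants would parameterize the optimizer and embedding layers; they are collected here only to make the reported configuration concrete.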