Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Urban Dreams of Migrants: A Case Study of Migrant Integration in Shanghai

Authors: Yang Yang, Chenhao Tan, Zongtao Liu, Fei Wu, Yueting Zhuang

AAAI 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To investigate the process of migrant integration, we employ a one-month complete dataset of telecommunication metadata in Shanghai with 54 million users and 698 million call logs. ...Our classifier is able to achieve an F1-score of 0.82 when distinguishing settled migrants from locals...
Researcher Affiliation	Academia	Yang Yang, Chenhao Tan, Zongtao Liu, Fei Wu, Yueting Zhuang College of Computer Science and Technology, Zhejiang University, China Department of Computer Science, University of Colorado Boulder, USA EMAIL, EMAIL, EMAIL
Pseudocode	No	The paper describes methods in prose, tables, and figures, but no explicit pseudocode or algorithm blocks are provided.
Open Source Code	No	The paper does not contain any statement about releasing source code or a link to a code repository.
Open Datasets	No	Our dataset contains complete telecommunication records between mobile users using China Telecom in Shanghai, spanning a month from September 3rd, 2016, to September 30th, 2016 (four weeks). The data is provided by China Telecom, the third largest mobile service provider in China.
Dataset Splits	No	We randomly draw 50% of users and use their calling logs in week 2 to train the classifier. The remaining data is used to test the classifier (50% of data in week 2, and 100% of data in week 3 and week 4). ...We choose the best ℓ2 penalty coefficient using 5-fold cross-validation in training data.
Hardware Specification	No	The paper does not provide any specific details about the hardware used for experiments.
Software Dependencies	No	The paper states 'We use ℓ2-regularized logistic regression' for the classifier but provides no specific software version numbers for any tools, libraries, or programming languages used.
Experiment Setup	Yes	We choose the best ℓ2 penalty coefficient using 5-fold cross-validation in training data.
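
The split protocol quoted under Dataset Splits and Experiment Setup can be sketched as follows. This is an illustration only, not the authors' code: the record layout `(user_id, week, features, label)`, the function names, and the seeds are all hypothetical, and the quoted text does not say how week 1 is used, so week 1 records are simply left out here. The classifier itself (ℓ2-regularized logistic regression) is not implemented; the sketch only shows how the 50% user draw, the train/test assignment, and the 5 cross-validation folds might be produced.

```python
import random


def split_users(user_ids, seed=0):
    """Randomly draw 50% of users for training; the rest are held out."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return set(shuffled[:half]), set(shuffled[half:])


def assign_records(records, train_users):
    """Assign records to train or test.

    records: iterable of (user_id, week, features, label) tuples
    (a hypothetical layout chosen for this sketch).
    Train: week 2 logs of the sampled users.
    Test: remaining week 2 logs, plus all logs from weeks 3 and 4.
    Week 1 is skipped (assumption; the quoted text does not cover it).
    """
    train, test = [], []
    for rec in records:
        user_id, week = rec[0], rec[1]
        if week == 2 and user_id in train_users:
            train.append(rec)
        elif week in (2, 3, 4):
            test.append(rec)
    return train, test


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, val_idx
```

In such a setup, each candidate ℓ2 penalty would be scored on the validation folds drawn from the training records only, and the best-scoring value would then be used to fit the final classifier before evaluating on the test records.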