Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Urban Dreams of Migrants: A Case Study of Migrant Integration in Shanghai
Authors: Yang Yang, Chenhao Tan, Zongtao Liu, Fei Wu, Yueting Zhuang
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To investigate the process of migrant integration, we employ a one-month complete dataset of telecommunication metadata in Shanghai with 54 million users and 698 million call logs. ...Our classifier is able to achieve an F1-score of 0.82 when distinguishing settled migrants from locals... |
| Researcher Affiliation | Academia | Yang Yang, Chenhao Tan, Zongtao Liu, Fei Wu, Yueting Zhuang. College of Computer Science and Technology, Zhejiang University, China; Department of Computer Science, University of Colorado Boulder, USA. |
| Pseudocode | No | The paper describes methods in prose, tables, and figures, but no explicit pseudocode or algorithm blocks are provided. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or a link to a code repository. |
| Open Datasets | No | Our dataset contains complete telecommunication records between mobile users using China Telecom in Shanghai, spanning a month from September 3rd, 2016, to September 30th, 2016 (four weeks). The data is provided by China Telecom, the third largest mobile service provider in China. |
| Dataset Splits | No | We randomly draw 50% of users and use their calling logs in week 2 to train the classifier. The remaining data is used to test the classifier (50% of data in week 2, and 100% of data in week 3 and week 4). ...We choose the best ℓ2 penalty coefficient using 5-fold cross-validation in training data. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for experiments. |
| Software Dependencies | No | The paper states 'We use ℓ2-regularized logistic regression' for the classifier but provides no specific software version numbers for any tools, libraries, or programming languages used. |
| Experiment Setup | Yes | We choose the best ℓ2 penalty coefficient using 5-fold cross-validation in training data. |
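
The split quoted under Dataset Splits and the 5-fold cross-validation quoted under Experiment Setup can be illustrated with a minimal sketch. This is not the authors' code: the record layout `(user_id, week, features)`, the function names, and the seeds are hypothetical, and the sketch assumes that week 1 records are simply left out of both sets, which the quoted text does not specify. It only shows how half of the users' week 2 logs could form the training set, how the remaining week 2 logs plus all of weeks 3 and 4 could form the test set, and how training indices could be partitioned into five folds for choosing the ℓ2 penalty.

```python
import random


def split_users(user_ids, seed=0):
    """Randomly assign half of the users to training; the rest are held out."""
    ids = sorted(user_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return set(ids[:half]), set(ids[half:])


def split_records(records, train_users):
    """Split (user_id, week, features) records (a hypothetical layout).

    Train: week 2 logs of the sampled users.
    Test:  week 2 logs of the other users, plus all logs in weeks 3 and 4.
    Records from any other week are ignored (an assumption of this sketch).
    """
    train, test = [], []
    for rec in records:
        user_id, week = rec[0], rec[1]
        if week == 2 and user_id in train_users:
            train.append(rec)
        elif week in (2, 3, 4):
            test.append(rec)
    return train, test


def k_fold_indices(n, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs that partition range(n) into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train_idx, val_idx
```

In use, each candidate ℓ2 penalty would be scored on the five validation folds drawn from the training set only, and the best-scoring value would then be refit on the full training set before evaluating on the test set.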