Sampling Representative Users from Large Social Networks
Authors: Jie Tang, Chenhui Zhang, Keke Cai, Li Zhang, Zhong Su
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two datasets show that the proposed models for sampling representative users significantly outperform (+6%-23% in terms of Precision@100) several alternative methods using authority or structure information only. The proposed algorithms are also effective in terms of time complexity. Only a few seconds are needed to sample 300 representative users from a network of 100,000 users. All data and code are publicly available. |
| Researcher Affiliation | Collaboration | Jie Tang, Chenhui Zhang, Keke Cai, Li Zhang, Zhong Su. Department of Computer Science and Technology, Tsinghua University; Tsinghua National Laboratory for Information Science and Technology (TNList); IBM, China Research Lab. jietang@tsinghua.edu.cn, zh.sherlock@gmail.com, {caikeke, lizhang, suzhong}@cn.ibm.com |
| Pseudocode | Yes | Algorithm 1: Approximate algorithm for S3 model. |
| Open Source Code | Yes | All data and code are publicly available at http://arnetminer.org/repuser/ |
| Open Datasets | Yes | All data and code are publicly available at http://arnetminer.org/repuser/ |
| Dataset Splits | No | No specific dataset split information for validation (percentages, sample counts, citations to predefined splits, or detailed splitting methodology) was found. The paper describes a train/test-style evaluation. |
| Hardware Specification | No | No specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments were found. |
| Software Dependencies | No | The paper states 'We implement all the algorithms [in] C++.' but does not provide specific version numbers for the C++ compiler or any libraries. |
| Experiment Setup | Yes | Let $f_l$ be the total frequency of the $l$-th keyword ($1 \le l \le 200$). We assign $m_l = f_l^{0.5}$ in the S3 model and $\lambda_l = f_l^{1}$ in the SSD model. The parameter $\beta$ in the S3 model is set to $\beta = 0.7$, by tuning from 0.1 to 1 with interval 0.1. |
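
The setup quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not the authors' C++ implementation: the function names (`keyword_quotas`, `beta_grid`, `precision_at_k`) are hypothetical, the frequencies in the usage note are made up, and Precision@k is assumed to follow the standard definition (fraction of the top-k sampled users that are judged representative), which this summary does not spell out.

```python
def keyword_quotas(freqs, exponent=0.5):
    """Per-keyword weight m_l = f_l ** exponent (0.5 for S3, per the quoted setup)."""
    return [f ** exponent for f in freqs]


def beta_grid(start=0.1, stop=1.0, step=0.1):
    """Candidate beta values: 0.1, 0.2, ..., 1.0 (the tuning range quoted above)."""
    n = round((stop - start) / step) + 1
    # Round each value to suppress floating-point drift such as 0.30000000000000004.
    return [round(start + i * step, 10) for i in range(n)]


def precision_at_k(ranked_users, relevant_users, k=100):
    """Assumed standard Precision@k: hits among the first k ranked users, divided by k."""
    top = ranked_users[:k]
    hits = sum(1 for user in top if user in relevant_users)
    return hits / k
```

For example, `keyword_quotas([4, 9, 16])` yields the square roots of the three frequencies, and `beta_grid()` lists the ten candidate values among which the paper reports selecting 0.7. How each candidate was scored during tuning is not stated in this summary.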