On Modeling and Predicting Individual Paper Citation Count over Time
Authors: Shuai Xiao, Junchi Yan, Changsheng Li, Bo Jin, Xiangfeng Wang, Xiaokang Yang, Stephen M. Chu, Hongyuan Zha
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results on the Microsoft Academic Graph data suggest that our model can be useful for both prediction and interpretability. We perform citation count prediction on the real-world dataset: Microsoft Academic Graph [Sinha et al., 2015]. MAPE and accuracy are given in Fig. 3, where each column of the first two rows shows the results for Computer Science papers published in journals, conferences, and IJCAI, respectively. |
| Researcher Affiliation | Collaboration | Shuai Xiao1, Junchi Yan2,3, Changsheng Li3, Bo Jin2, Xiangfeng Wang2, Xiaokang Yang1, Stephen M. Chu3, Hongyuan Zha2 — 1 Shanghai Jiao Tong University, 2 East China Normal University, 3 IBM Research China |
| Pseudocode | No | The paper describes the model learning and prediction process using mathematical equations and textual steps, but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement about releasing source code, nor does it include a link to a code repository. |
| Open Datasets | Yes | We perform citation count prediction on the real-world dataset: Microsoft Academic Graph [Sinha et al., 2015] of which the papers are well collected, complete and authorized. |
| Dataset Splits | No | The paper states 'we use papers with more than 5 citations during the first 5 years after publication as training data and predict their citations in the next 10 years', but it does not specify any explicit validation set or split percentages for data partitioning. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or cloud instance types). |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, specific libraries, or solvers). |
| Experiment Setup | Yes | Similar to the protocol in [Wang et al., 2013; Shen et al., 2014], we use papers with more than 5 citations during the first 5 years after publication as training data and predict their citations in the next 10 years. Accuracy measures the fraction of papers correctly predicted for a given error tolerance ε. Hence the accuracy of popularity prediction on N papers is (1/N) Σ_d 1{ \|c_d(t) − r_d(t)\| / r_d(t) ≤ ε }, where c_d(t) is the predicted and r_d(t) the real citation count of paper d. [Shen et al., 2014] set ε = 0.1 on their dataset. We find in our tests that our method always outperforms regardless of ε, and we set ε = 0.3. By using the sparsity regularization (set λ = 2 in Eq. 1), we can select the most important and interpretable features. |
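The two evaluation metrics quoted above (MAPE and ε-tolerance accuracy) are simple to compute. The following is a minimal sketch, assuming predicted and real citation counts are given as arrays; the function names and the ε default of 0.3 (the value the paper reports using) are illustrative, not taken from any released code.

```python
import numpy as np

def mape(pred, real):
    """Mean absolute percentage error over N papers:
    (1/N) * sum(|c_d - r_d| / r_d)."""
    pred = np.asarray(pred, dtype=float)
    real = np.asarray(real, dtype=float)
    return float(np.mean(np.abs(pred - real) / real))

def accuracy(pred, real, eps=0.3):
    """Fraction of papers whose relative prediction error
    |c_d - r_d| / r_d is within the tolerance eps."""
    pred = np.asarray(pred, dtype=float)
    real = np.asarray(real, dtype=float)
    return float(np.mean(np.abs(pred - real) / real <= eps))
```

For example, with real counts [10, 40] and predictions [10, 20], the relative errors are 0.0 and 0.5, giving MAPE 0.25 and accuracy 0.5 at ε = 0.3.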