On Modeling and Predicting Individual Paper Citation Count over Time

Authors: Shuai Xiao, Junchi Yan, Changsheng Li, Bo Jin, Xiangfeng Wang, Xiaokang Yang, Stephen M. Chu, Hongyuan Zha

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results on the Microsoft Academic Graph data suggest that the model can be useful for both prediction and interpretability. 'We perform citation count prediction on the real-world dataset: Microsoft Academic Graph [Sinha et al., 2015].' MAPE and accuracy results are given in Fig. 3, where each column of the first two rows shows the results for Computer Science papers published in journals, conferences, and IJCAI, respectively.
Researcher Affiliation | Collaboration | Shuai Xiao1, Junchi Yan2,3, Changsheng Li3, Bo Jin2, Xiangfeng Wang2, Xiaokang Yang1, Stephen M. Chu3, Hongyuan Zha2 (1 Shanghai Jiao Tong University; 2 East China Normal University; 3 IBM Research China)
Pseudocode | No | The paper describes the model learning and prediction process using mathematical equations and textual steps, but does not include any clearly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement about releasing source code, nor does it include a link to a code repository.
Open Datasets | Yes | 'We perform citation count prediction on the real-world dataset: Microsoft Academic Graph [Sinha et al., 2015] of which the papers are well collected, complete and authorized.'
Dataset Splits | No | The paper states 'we use papers with more than 5 citations during the first 5 years after publication as training data and predict their citations in the next 10 years', but it does not specify any explicit validation set or split percentages for data partitioning.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, memory, or cloud instance types).
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python, specific libraries, or solvers).
Experiment Setup | Yes | Similar to the protocol in [Wang et al., 2013; Shen et al., 2014], we use papers with more than 5 citations during the first 5 years after publication as training data and predict their citations in the next 10 years. Accuracy measures the fraction of papers correctly predicted within a given error tolerance ε; hence the accuracy of popularity prediction on N papers is $\frac{1}{N}\sum_{d=1}^{N}\mathbf{1}\{|c_d(t)/r_d(t) - 1| \le \varepsilon\}$, where $c_d(t)$ is the predicted and $r_d(t)$ the real citation count of paper d at time t. [Shen et al., 2014] set ε = 0.1 on their dataset; we find in our tests that our method always outperforms regardless of ε, and we set ε = 0.3. By using the sparsity regularization (setting λ = 2 in Eq. 1), we can select the most important and interpretable features. (Minimal sketches of this train/predict split and of the accuracy metric follow the table.)
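
To make the evaluation concrete, here is a minimal Python sketch of the two measures named in the report, MAPE and tolerance-based accuracy. The function names and array-based interface are our own illustration, not code from the paper; only the metric definitions and the ε values come from the text above.

```python
import numpy as np

def mape(predicted, real):
    """Mean absolute percentage error over N papers:
    (1/N) * sum_d |c_d(t) / r_d(t) - 1|."""
    predicted = np.asarray(predicted, dtype=float)
    real = np.asarray(real, dtype=float)
    return float(np.mean(np.abs(predicted / real - 1.0)))

def accuracy(predicted, real, eps=0.3):
    """Fraction of papers whose relative error |c_d(t)/r_d(t) - 1| is
    within the tolerance eps; the paper sets eps = 0.3, while
    [Shen et al., 2014] used eps = 0.1."""
    predicted = np.asarray(predicted, dtype=float)
    real = np.asarray(real, dtype=float)
    return float(np.mean(np.abs(predicted / real - 1.0) <= eps))
```

For example, with predicted = [110, 95, 200] and real = [100, 100, 100], accuracy(predicted, real, eps=0.3) returns 2/3: only the first two predictions fall within 30% of the true counts.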
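Likewise, a sketch of the train/predict partition described under Dataset Splits and Experiment Setup, assuming each paper is represented by a sorted list of citation times in years since publication. The >5-citation filter and the 5-year/10-year horizons come from the paper; the input format and helper name are hypothetical.

```python
def split_papers(papers, min_citations=5, train_years=5, predict_years=10):
    """Partition per-paper citation histories following the paper's
    protocol: keep papers with more than `min_citations` citations in
    their first `train_years` years, observe that window, and predict
    the following `predict_years` years.

    `papers` maps a paper id to a sorted list of citation times in
    years since publication (a hypothetical input format)."""
    splits = {}
    for pid, times in papers.items():
        observed = [t for t in times if t <= train_years]
        if len(observed) <= min_citations:
            continue  # fails the "more than 5 citations in 5 years" filter
        target = [t for t in times
                  if train_years < t <= train_years + predict_years]
        splits[pid] = (observed, target)
    return splits
```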