Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Coupling Implicit and Explicit Knowledge for Customer Volume Prediction

Authors: Jingyuan Wang, Yating Lin, Junjie Wu, Zhong Wang, Zhang Xiong

AAAI 2017 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The effectiveness of GR-NMF in coupling all-round knowledge is verified over a real-life outpatient dataset under different scenarios. GR-NMF shows particularly evident advantages to all baselines in location selection with the cold-start challenge. Extensive experiments are conducted on a real-life outpatient dataset obtained from the Shenzhen city of China. The results show that GR-NMF outperforms competitive baselines consistently in various application scenarios with different sampling rates.
Researcher Affiliation	Academia	Jingyuan Wang, Yating Lin, Junjie Wu, * Zhong Wang, Zhang Xiong School of Computer Science and Engineering, Beihang University, Beijing, China School of Economics and Management, Beihang University, Beijing, China Research Institute of Beihang University in Shenzhen, Shenzhen, China Email: EMAIL, *Corresponding author
Pseudocode	No	The paper describes the inference method using mathematical equations and textual descriptions, but it does not include any structured pseudocode or algorithm blocks with explicit labels like 'Pseudocode' or 'Algorithm'.
Open Source Code	No	The paper does not provide any explicit statement about making its source code open, nor does it include a link to a code repository.
Open Datasets	No	The paper states: 'We perform our experiments on an outpatient service data set collected from the public hospital system of Shenzhen, a major city in southern China1.' Footnote 1 links to the Wikipedia page for Shenzhen, not to the dataset itself. No specific link, DOI, or formal citation is provided for public access to the collected dataset.
Dataset Splits	No	The paper describes its experimental setup where 'sampled elements' (yij=1) are known and 'unsampled elements' (yij=0) are treated as unknown for prediction and evaluation. It mentions varying 'sampling rate from 10% to 50%', which implicitly defines known vs. unknown data. However, it does not explicitly state conventional training, validation, and test splits with specific percentages or sample counts for reproducibility. No distinct validation set is mentioned.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU, memory, cloud platform) used to conduct the experiments.
Software Dependencies	No	The paper does not specify any software dependencies with version numbers, such as programming languages, libraries, or specific solvers used for implementation or experimentation.
Experiment Setup	Yes	To set a proper H, therefore, we take a warmup experiment on a sampled dataset with 10% of all hospitals, and watch the predictive precision of GR-NMF with H varying from 5 to 40. As can be seen from Fig. 1, when H > 20 the increasing trend of the predictive precision of GR-NMF tends to be flattened. As a result, we set H = 20 as the default setting in the following experiments. The objective function of GR-NMF is $\min J = \|Y \odot (X - SC)\|_F^2 + \alpha \|Y \odot (X - A_k w)\|_F^2 + \beta \|Y \odot (A_k w - SC)\|_F^2 + \gamma \|w\|_2^2 + \delta \|S\|_1 + \zeta \|C\|_1$ s.t. $S \ge 0$, $C \ge 0$, $w \ge 0$, where $\alpha = \sigma_{X1}^2 / \sigma_{X2}^2$, $\beta = \sigma_{X1}^2 / \sigma_{W2}^2$, $\gamma = \sigma_{X1}^2 / \sigma_{W1}^2$, $\delta = \sigma_{X1}^2 / \sigma_{S}^2$, $\zeta = \sigma_{X1}^2 / \sigma_{R}^2$, which can be well estimated in advance by minimizing J1, J2 and J3 separately. In other words, these parameters are to be set before the optimization of J.
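
The masked objective quoted in the Experiment Setup row can be illustrated with a short NumPy sketch. This is not the authors' implementation (the paper releases no code). It only evaluates J for given factors and makes several assumptions that are not confirmed by the source: Y is treated as an element-wise 0/1 mask over sampled entries, the explicit-knowledge term is modelled as a feature tensor A of shape (M, N, K) contracted with a weight vector w, and the function names `sampling_mask` and `gr_nmf_objective` are hypothetical.

```python
import numpy as np


def sampling_mask(shape, rate, rng):
    """Binary mask Y: 1 marks a sampled (known) entry, 0 an unsampled one."""
    return (rng.random(shape) < rate).astype(float)


def gr_nmf_objective(X, Y, S, C, A, w, alpha, beta, gamma, delta, zeta):
    """Evaluate J for fixed S, C, w. This does not optimize anything."""
    nmf_pred = S @ C   # implicit-knowledge prediction, shape (M, N)
    reg_pred = A @ w   # assumed explicit-knowledge prediction: (M, N, K) @ (K,)

    def masked_sq(residual):
        # Squared Frobenius norm restricted to sampled entries (assumed masking).
        return np.sum((Y * residual) ** 2)

    return (
        masked_sq(X - nmf_pred)
        + alpha * masked_sq(X - reg_pred)
        + beta * masked_sq(reg_pred - nmf_pred)
        + gamma * np.sum(w ** 2)     # squared L2 penalty on w
        + delta * np.abs(S).sum()    # L1 penalty on S
        + zeta * np.abs(C).sum()     # L1 penalty on C
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M, N, K, H = 30, 40, 5, 20   # H = 20 is the default reported in the paper
    X = rng.random((M, N))
    Y = sampling_mask((M, N), rate=0.1, rng=rng)   # 10% sampling rate
    S, C = rng.random((M, H)), rng.random((H, N))
    A, w = rng.random((M, N, K)), rng.random(K)
    print(gr_nmf_objective(X, Y, S, C, A, w, 1.0, 1.0, 1.0, 1.0, 1.0))
```

In this sketch the weights alpha through zeta are passed in as plain numbers, reflecting the statement that they are set before J is optimized. The values of 1.0 used in the demo are placeholders, not values from the paper.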