Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Finding Communities with Hierarchical Semantics by Distinguishing General and Specialized topics

Authors: Ge Zhang, Di Jin, Jian Gao, Pengfei Jiao, Françoise Fogelman-Soulié, Xin Huang

IJCAI 2018 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The superiority of our algorithm for community detection is further demonstrated in comparison with eight state-of-the-art algorithms on eight real-world networks.
Researcher Affiliation	Academia	1 School of Computer Science and Technology, Tianjin University, Tianjin, China; 2 College of Information Science and Technology, Dalian Maritime University, Dalian, China; 3 School of Computer Software, Tianjin University, Tianjin, China; 4 Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
Pseudocode	Yes	Alg. 1: Process of TLSC
Open Source Code	No	The paper does not provide any concrete access to source code for the methodology described.
Open Datasets	Yes	The dataset we use in this case study analysis is the British online music platform Last.fm [Cantador, 2016]. We then evaluate our approach on 8 real networks in comparison with 8 state-of-the-art methods. Table 2: Datasets used. n is the number of nodes, e the number of edges, m the number of attributes, and c the number of communities. [Leskovec, 2016] Texas: n = 187, e = 328, m = 1,703, c = 5; Cornell: n = 195, e = 304, m = 1,703, c = 5; Washington: n = 230, e = 446, m = 1,703, c = 5; Wisconsin: n = 265, e = 530, m = 1,703, c = 5 (the WebKB network consists of four subnetworks from four American universities, which are Texas, Cornell, Washington and Wisconsin, respectively); Twitter: n = 171, e = 796, m = 578, c = 7 (largest subnetwork (id 629863) in Twitter data); Cite: n = 3,312, e = 4,732, m = 3,703, c = 6 (a Citeseer citation network); Cora: n = 2,708, e = 5,429, m = 1,433, c = 7 (a Cora citation network); Pubmed: n = 19,729, e = 44,338, m = 500, c = 3 (publications in PubMed on diabetes).
Dataset Splits	No	The paper does not provide specific details regarding training, validation, and test dataset splits (e.g., percentages or counts).
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud computing instance types) used for running the experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers (e.g., library names with versions).
Experiment Setup	Yes	In our algorithm, we set the number of specialized topics and general topics to 1 and 1/2 of the number of communities. We also performed some experiments to vary the number of general topics, and found that highly overlapping general topics will appear when the number of general topics is greater than 4. So we set this number to 4 (E = 4).