Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Finding Communities with Hierarchical Semantics by Distinguishing General and Specialized topics
Authors: Ge Zhang, Di Jin, Jian Gao, Pengfei Jiao, Françoise Fogelman-Soulié, Xin Huang
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The superiority of our algorithm for community detection is further demonstrated in comparison with eight state-of-the-art algorithms on eight real-world networks. |
| Researcher Affiliation | Academia | 1 School of Computer Science and Technology, Tianjin University, Tianjin, China; 2 College of Information Science and Technology, Dalian Maritime University, Dalian, China; 3 School of Computer Software, Tianjin University, Tianjin, China; 4 Department of Computer Science, Hong Kong Baptist University, Hong Kong, China |
| Pseudocode | Yes | Alg. 1: Process of TLSC |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | Yes | The dataset we use in this case study analysis is the British online music platform Last.fm [Cantador, 2016]. We then evaluate our approach on 8 real networks in comparison with 8 state-of-the-art methods. Table 2: Datasets used. n is the number of nodes, e the number of edges, m the number of attributes, and c the number of communities. [Leskovec, 2016] Datasets (n / e / m / c): Texas 187 / 328 / 1,703 / 5; Cornell 195 / 304 / 1,703 / 5; Washington 230 / 446 / 1,703 / 5; Wisconsin 265 / 530 / 1,703 / 5 (the WebKB network consists of four subnetworks from four American universities: Texas, Cornell, Washington and Wisconsin); Twitter 171 / 796 / 578 / 7 (largest subnetwork, id 629863, in Twitter data); Cite 3,312 / 4,732 / 3,703 / 6 (a Citeseer citation network); Cora 2,708 / 5,429 / 1,433 / 7 (a Cora citation network); Pubmed 19,729 / 44,338 / 500 / 3 (publications in PubMed on diabetes). |
| Dataset Splits | No | The paper does not provide specific details regarding training, validation, and test dataset splits (e.g., percentages or counts). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or cloud computing instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., library names with versions). |
| Experiment Setup | Yes | In our algorithm, we set the number of specialized topics and general topics to 1 and 1/2 of the number of communities. We also performed some experiments to vary the number of general topics, and found that highly overlapping general topics will appear when the number of general topics is greater than 4. So we set this number to 4 (E = 4). |