Efficient Community Search over Large Directed Graph: An Augmented Index-based Approach
Authors: Yankai Chen, Jie Zhang, Yixiang Fang, Xin Cao, Irwin King
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on six real large graphs show that our index-based query algorithm is up to two orders of magnitude faster than existing solutions. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Engineering, The Chinese University of Hong Kong 2School of Computer Science and Engineering, Nanyang Technological University 3School of Computer Science and Engineering, The University of New South Wales |
| Pseudocode | Yes | Algorithm 1 Index construction algorithm: Bottom Up. Algorithm 2 Functions of the CUF data structure. Algorithm 3 Process vertices to create tree nodes. |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the described methodology. |
| Open Datasets | Yes | We use six real large directed graphs in Table 1, where d is the average degree of vertices. The Twitter dataset 3 is collected by Kristina Lerman. The eu-2015, arabic, it-2004, sk-2005 and uk-2007 datasets [Boldi et al., 2014] are available in the website4. 3https://www.isi.edu/lerman/downloads/twitter/twitter2010.html 4http://law.di.unimi.it/datasets.php |
| Dataset Splits | No | The paper describes generating sub-datasets by randomly selecting percentages of vertices (e.g., "20%, 40%, 60% and 80% of its vertices and obtain the four subgraphs induced by these vertices") for scalability evaluation, but does not explicitly provide training, validation, or test dataset splits in the conventional sense for model training. |
| Hardware Specification | Yes | We implement all the algorithms in Java, and run the experiments on a Linux machine with Ten-core Intel E74820 V3 CPU@1.90GHz and 300GB memory. |
| Software Dependencies | No | The paper states that algorithms are implemented "in Java", but does not provide specific version numbers for Java or any other software libraries, frameworks, or dependencies used. |
| Experiment Setup | Yes | As suggested in [Fang et al., 2019b], for each sub-dataset, we randomly select 200 vertices which are in the (8,8)-cores as the query vertices and set k=l=8. We depict the effect of k and l on the efficiency of CSD queries in Figure 4(g)-(r). For each dataset, we randomly select 200 vertices within the (8,8)-cores to query. |