Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Spectral Algorithms for Community Detection in Directed Networks

Authors: Zhe Wang, Yingbin Liang, Pengsheng Ji

JMLR 2020 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we conduct experimental studies to compare the performance of six spectral clustering algorithms, namely, D-SCORE, D-SCOREq, rD-SCORE, rD-SCOREq, oPCA, rPCA, and two likelihood algorithms APL (Amini et al., 2013) and BCPL (Bickel and Chen, 2009b). We compare these eight algorithms on the web blogs data and the experiments on simulated data.
Researcher Affiliation Academia Zhe Wang Department of Electrical and Computer Engineering The Ohio State University Columbus, OH 43202, USA EMAIL Yingbin Liang Department of Electrical and Computer Engineering The Ohio State University Columbus, OH 43202, USA EMAIL Pengsheng Ji Department of Statistics University of Georgia Athens, GA 30602, USA EMAIL
Pseudocode Yes Algorithm 1: D-SCORE($\hat{U}$, $\hat{V}$, K); Algorithm 2: D-SCOREq($\hat{U}$, $\hat{V}$, K); Algorithm 3: Improved D-SCOREq(K, A) using intersection-with-attachment; Algorithm 4: oPCA; Algorithm 5: Regularized graph Laplacian
Open Source Code No The paper does not provide explicit links to source code repositories or statements confirming the release of their implementation code for the described methodology. The provided license link (https://creativecommons.org/licenses/by/4.0/) is for the paper itself, not the code.
Open Datasets Yes In this subsection, we apply the above mentioned eight algorithms to the web blogs data introduced in Adamic and Glance (2005). In this subsection, we apply the above mentioned eight algorithms to the email-Eu-core network introduced in Leskovec and Krevl (2014). The email data was collected from a large European research institution, and a directed edge from node i to node j indicates that person i has sent at least one email to person j. Clearly, the email-Eu-core network is also a directed network.
Dataset Splits No In our experiment, we first extract the largest component of the graph, which contains 1222 nodes... We repeat each algorithm on each setting 500 times and take the mean of the total number of misclustered nodes. In this subsection, we apply the above mentioned eight algorithms to the email-Eu-core network introduced in Leskovec and Krevl (2014). The email data was collected from a large European research institution, and a directed edge from node i to node j indicates that person i has sent at least one email to person j. Clearly, the email-Eu-core network is also a directed network. There are many communities in this network, but we extract the top 4 largest communities which contains 297 nodes as the entire graph and 252 nodes in intersection graph. We repeat the experiment 500 times and show the mean error in table 2.
Hardware Specification No The paper does not explicitly describe any specific hardware (e.g., GPU models, CPU types, or cloud computing instances) used for running the experiments or simulations.
Software Dependencies No The paper does not provide specific version numbers for any software libraries, programming languages, or tools used in the implementation of the algorithms or experiments.
Experiment Setup Yes Fix a threshold $T_n = \log n$ (used to avoid zero denominator), define the $n \times (K-1)$ ratio matrices $R_{\hat{U}}$ and $R_{\hat{V}}$, such that for $1 \le i \le n$, $1 \le k \le K-1$... (Algorithm 1) The regularization parameter $\tau$ is usually set as the average degree $\tau = \sum_{i,j=1}^{n} A(i, j)/n$. (Algorithm 5) We repeat each algorithm on each setting 500 times and take the mean of the total number of misclustered nodes.
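
The two quantities quoted in the Experiment Setup row can be illustrated with a small sketch. The first function computes the average-degree regularization parameter exactly as quoted (sum of all adjacency entries divided by n). The second counts misclustered nodes as the minimum number of mismatches over all relabelings of the predicted communities; that matching convention is an assumption here, since the excerpt does not state how the paper aligns labels. The function names and the toy graph are hypothetical and do not come from the paper.

```python
from itertools import permutations


def average_degree_tau(A):
    """tau = sum_{i,j} A(i, j) / n for an n x n adjacency matrix (list of lists)."""
    n = len(A)
    return sum(sum(row) for row in A) / n


def misclustered_count(true_labels, pred_labels, K):
    """Smallest number of mismatches over all relabelings of the K predicted
    communities (assumed convention; brute force, so only practical for small K)."""
    best = len(true_labels)
    for perm in permutations(range(K)):
        errors = sum(1 for t, p in zip(true_labels, pred_labels) if t != perm[p])
        best = min(best, errors)
    return best


# Toy directed graph on 4 nodes (hypothetical data, not from the paper).
A = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
tau = average_degree_tau(A)  # 5 directed edges over 4 nodes

true_labels = [0, 0, 1, 1]
pred_labels = [1, 1, 0, 1]  # community labels swapped, plus one wrong node
errors = misclustered_count(true_labels, pred_labels, K=2)
```

Averaging `errors` over repeated runs would give a figure comparable in kind to the mean misclustering counts the row describes, under the labeling assumption above.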