Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Lower Ricci Curvature for Efficient Community Detection

Authors: Yun Jin Park, Didong Li

TMLR 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Through applications on multiple real-world datasets, including the NCAA football league network, the DBLP collaboration network, the Amazon product co-purchasing network, and the YouTube social network, we demonstrate the efficacy of our method in significantly improving the performance of various community detection algorithms.
Researcher Affiliation	Academia	Yun Jin Park EMAIL Department of Biostatistics The University of North Carolina at Chapel Hill Didong Li EMAIL Department of Biostatistics The University of North Carolina at Chapel Hill
Pseudocode	Yes	Algorithm 1: LRC-based preprocessing algorithm for community detection. Input: raw network data G = (V, E). Output: preprocessed network data G′ = (V, E′). 1. Calculate the LRC for all edges; 2. Fit a Gaussian mixture model with two mixing components to the LRCs, obtaining the estimate p̂(x) = π₁N(x; µ₁, σ₁²) + π₂N(x; µ₂, σ₂²), where µ₁ < µ₂; 3. Find the local minimum β := inf_{µ₁<x<µ₂} p̂(x); 4. Remove all edges with LRCs smaller than β: E′ := {(ij) ∈ E : LRC(ij) ≥ β}.
Open Source Code	Yes	All codes can be found in https://github.com/parkyunjin/Lower Ricci Curv
Open Datasets	Yes	The four real datasets used in this paper can be downloaded from the following websites: 1. NCAA Football League network: https://websites.umich.edu/~mejn/netdata/ under "American College football". The graph is provided in the Graph Modeling Language (.gml) format. 2. DBLP collaboration network: https://snap.stanford.edu/data/com-DBLP.html This graph is represented as NetworkX objects, provided by the CDlib Python package. 3. Amazon product co-purchasing network: https://snap.stanford.edu/data/com-Amazon.html This graph is represented as NetworkX objects, provided by the CDlib Python package. 4. YouTube social network: https://snap.stanford.edu/data/com-Youtube.html This graph is represented as NetworkX objects, provided by the CDlib Python package.
Dataset Splits	No	The paper describes using real-world datasets with "known community structures" or "ground-truth conference groups" for evaluation, but does not specify explicit training/test/validation splits for the datasets, as the community detection algorithms typically operate on the entire network for evaluation against ground truth.
Hardware Specification	No	The paper mentions "runtime (in seconds)" for algorithms in tables but provides no specific details about the hardware (CPU, GPU, memory, etc.) used for these computations.
Software Dependencies	No	All algorithms implemented in this paper are from Python package CDlib (Rossetti et al., 2019). This mentions a package but without a specific version number. No other software dependencies are listed with version numbers.
Experiment Setup	Yes	E.1 Hyperparameters for community detection. All algorithms implemented in this paper are from Python package CDlib (Rossetti et al., 2019). The hyperparameters are as follows: 1. NCAA Football League network: Label Propagation: NA. Leiden: Initial membership = None, weights = None. Girvan-Newman: Level = 10. Walktrap: NA. 2. DBLP collaboration network: Angel: Threshold = 0.5, minimum community size = 3. Ego-networks: Level = 1. K-clique: K = 3. SLPA: t = 20, r = 0.1. [Similar details are provided for Amazon and YouTube networks].
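Steps 2 to 4 of Algorithm 1 in the Pseudocode row can be sketched as follows. This is a minimal illustration and not the authors' implementation. It assumes the LRC value of each edge has already been computed (the curvature formula is not given on this page, so step 1 is omitted), it uses a small hand-written EM routine as a stand-in for a library Gaussian mixture fit, and it reads β as the location of the smallest fitted density between the two component means. All function names and the synthetic data are hypothetical.

```python
import math
import random


def normal_pdf(x, mu, var):
    """Density of N(mu, var) at x."""
    return math.exp(-((x - mu) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_two_component_gmm(values, n_iter=200):
    """Fit a 1-D two-component Gaussian mixture by EM (illustrative stand-in).

    Returns [(pi1, mu1, var1), (pi2, mu2, var2)] sorted so that mu1 <= mu2.
    """
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    var0 = max(sum((x - mean) ** 2 for x in xs) / n, 1e-6)
    # Assumption: start from the lower and upper quartiles with a shared variance.
    params = [(0.5, xs[n // 4], var0), (0.5, xs[(3 * n) // 4], var0)]
    for _ in range(n_iter):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in xs:
            w = [p * normal_pdf(x, mu, var) for p, mu, var in params]
            total = sum(w) or 1e-300
            resp.append([wk / total for wk in w])
        # M-step: re-estimate weight, mean, and variance of each component.
        new_params = []
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = max(sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk, 1e-6)
            new_params.append((nk / n, mu, var))
        params = new_params
    return sorted(params, key=lambda p: p[1])


def density_minimum_between_means(params, grid_size=1000):
    """Grid search for the point of smallest mixture density between the two means."""
    mu1, mu2 = params[0][1], params[1][1]

    def density(x):
        return sum(p * normal_pdf(x, mu, var) for p, mu, var in params)

    grid = [mu1 + (mu2 - mu1) * t / grid_size for t in range(grid_size + 1)]
    return min(grid, key=density)


def prune_low_curvature_edges(edge_curvature):
    """Keep edges whose curvature is at least beta; edge_curvature maps (i, j) -> LRC."""
    params = fit_two_component_gmm(list(edge_curvature.values()))
    beta = density_minimum_between_means(params)
    kept = [edge for edge, c in edge_curvature.items() if c >= beta]
    return kept, beta


# Synthetic curvature values (not real data): 30 low-curvature and 70 high-curvature edges.
rng = random.Random(0)
edge_curvature = {}
for i in range(30):
    edge_curvature[("a%d" % i, "b%d" % i)] = rng.gauss(-1.0, 0.1)
for i in range(70):
    edge_curvature[("c%d" % i, "d%d" % i)] = rng.gauss(0.5, 0.1)

kept, beta = prune_low_curvature_edges(edge_curvature)
print(len(edge_curvature), "edges before,", len(kept), "edges after; beta =", round(beta, 3))
```

In practice the pruned edge list would be turned back into a graph and passed to one of the CDlib algorithms listed in the Experiment Setup row. A library mixture-model fit would normally replace the hand-written EM loop.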