Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

Subsampling in Large Graphs Using Ricci Curvature

Authors: Shushan Wu, Huimin Cheng, Jiazhang Cai, Ping Ma, Wenxuan Zhong

ICLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on synthetic and benchmark datasets demonstrate the advantages of our algorithm.
Researcher Affiliation	Academia	Shushan Wu, Huimin Cheng, Jiazhang Cai, Ping Ma & Wenxuan Zhong Department of Statistics University of Georgia Athens, GA 30605, USA EMAIL
Pseudocode	Yes	Algorithm 1 Input: Graph G; Number of nodes to be subsampled n; OR curvature of the graph G calculated by parameter α; the subsampled node set S = Initialization: Randomly choose a node v0 as the start of sampling. Here, t = 0. Cold Start: Among all neighbors of v0, randomly add v1 to S. Here, t = 1. While: \|S\| < n Step 1: Given the edge et = (vt 1, vt) selected in the t-th step, get the edge curvatures of the et s neighboring edges (et). Step 2: Select the edge et+1 in the edge set (et) that has the greatest OGR. Step 3: Add nodes connecting the edge et+1 to the subsampled node set S. t = t + 1. Output: Subsampled node set S and the induced subgraph G[S]
Open Source Code	No	The paper does not provide any explicit statements about releasing source code or links to a code repository for their method.
Open Datasets	Yes	We use five widely-used read-world graph datasets with labeled community structures to validate the performance of our method, including Polbooks, Facebook, Cora, Polblogs, and Pub Med (Rossi & Ahmed, 2015; Rozemberczki et al., 2020a; Sen et al., 2008).
Dataset Splits	No	The paper discusses subsampling proportions and replications for estimation but does not describe traditional training, validation, or test dataset splits in the context of model training or evaluation. The objective is to subsample a graph to preserve properties for estimation.
Hardware Specification	Yes	All the experiments are conducted on a machine with a 40-core NVIDIA Tesla V100 GPU (3.00 GHz).
Software Dependencies	No	We set the hyper-parameters of the seven methods as the default in the package Little Ball of Fur. While a package is named, no specific version number is provided for it or any other software dependency.
Experiment Setup	Yes	We set the community proportion as (3/4, 1/10, 1/12, 1/15) with 900 nodes in total. The out-in-ratio... We set the probability of an edge within a block as 0.8 and vary the probability of an edge between blocks (pout) ({0.06, 0.08, 0.10, 0.12}) and subsampling proportions ({0.1, 0.12, 0.14, 0.16})... The subsampling time r for estimating M is set as 3... The hyper-parameter used to calculate the Ricci curvature of graphs is set as α = 0.5.