Subsampling in Large Graphs Using Ricci Curvature

Authors: Shushan Wu, Huimin Cheng, Jiazhang Cai, Ping Ma, Wenxuan Zhong

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments on synthetic and benchmark datasets demonstrate the advantages of our algorithm.
Researcher Affiliation Academia Shushan Wu, Huimin Cheng, Jiazhang Cai, Ping Ma & Wenxuan Zhong Department of Statistics University of Georgia Athens, GA 30605, USA {Shushan.Wu,huimincheng,jc27544,pingma,wenxuan}@uga.edu
Pseudocode Yes Algorithm 1 Input: Graph G; Number of nodes to be subsampled n; OR curvature of the graph G calculated by parameter α; the subsampled node set S = Initialization: Randomly choose a node v0 as the start of sampling. Here, t = 0. Cold Start: Among all neighbors of v0, randomly add v1 to S. Here, t = 1. While: |S| < n Step 1: Given the edge et = (vt 1, vt) selected in the t-th step, get the edge curvatures of the et s neighboring edges (et). Step 2: Select the edge et+1 in the edge set (et) that has the greatest OGR. Step 3: Add nodes connecting the edge et+1 to the subsampled node set S. t = t + 1. Output: Subsampled node set S and the induced subgraph G[S]
Open Source Code No The paper does not provide any explicit statements about releasing source code or links to a code repository for their method.
Open Datasets Yes We use five widely-used read-world graph datasets with labeled community structures to validate the performance of our method, including Polbooks, Facebook, Cora, Polblogs, and Pub Med (Rossi & Ahmed, 2015; Rozemberczki et al., 2020a; Sen et al., 2008).
Dataset Splits No The paper discusses subsampling proportions and replications for estimation but does not describe traditional training, validation, or test dataset splits in the context of model training or evaluation. The objective is to subsample a graph to preserve properties for estimation.
Hardware Specification Yes All the experiments are conducted on a machine with a 40-core NVIDIA Tesla V100 GPU (3.00 GHz).
Software Dependencies No We set the hyper-parameters of the seven methods as the default in the package Little Ball of Fur. While a package is named, no specific version number is provided for it or any other software dependency.
Experiment Setup Yes We set the community proportion as (3/4, 1/10, 1/12, 1/15) with 900 nodes in total. The out-in-ratio... We set the probability of an edge within a block as 0.8 and vary the probability of an edge between blocks (pout) ({0.06, 0.08, 0.10, 0.12}) and subsampling proportions ({0.1, 0.12, 0.14, 0.16})... The subsampling time r for estimating M is set as 3... The hyper-parameter used to calculate the Ricci curvature of graphs is set as α = 0.5.