GC-Flow: A Graph-Based Flow Network for Effective Clustering

Authors: Tianchun Wang, Farzaneh Mirzazadeh, Xiang Zhang, Jie Chen

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct a comprehensive set of experiments to evaluate the performance of GC-Flow on graph data and demonstrate that it is competitive with GNNs for classification, while being advantageous in learning representations that extract the clustering structure of the data.
Researcher Affiliation | Collaboration | ¹Pennsylvania State University, ²MIT-IBM Watson AI Lab, IBM Research. Correspondence to: Tianchun Wang <tkw5356@psu.edu>, Jie Chen <chenjie@us.ibm.com>.
Pseudocode | No | The paper describes its methods through text and mathematical equations but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/xztcwang/GCFlow.
Open Datasets | Yes | Data sets. We use six benchmark GNN data sets. Data sets Cora, Citeseer, and Pubmed are citation graphs... We follow the predefined splits in Kipf & Welling (2017)... For statistics of the data sets, see Table 5 in Appendix G... All data sets used in the experiments are obtained from PyTorch Geometric. (A hedged loading sketch follows the table.)
Dataset Splits | Yes | We randomly sample 200/1300/1000 nodes for training/validation/testing for Computers and 80/620/1000 for Photo. The data set Wiki-CS is a web graph... We use one of the predefined splits. (A split-construction sketch follows the table.)
Hardware Specification | Yes | We conduct the experiments on a server with four NVIDIA RTX A6000 GPUs (48GB memory each).
Software Dependencies | No | We implemented all models using PyTorch (Paszke et al., 2019), PyTorch Geometric (Fey & Lenssen, 2019), and Scikit-learn (Pedregosa et al., 2011). While citations are provided, explicit version numbers for the key software dependencies are not listed.
Experiment Setup | Yes | Implementation details. For fair comparison, we run all models on the entire data set under the transductive semi-supervised setting. All models are initialized with Glorot initialization (Glorot & Bengio, 2010) and are trained using the Adam optimizer (Kingma & Ba, 2015)... Hyperparameters. We use grid search to tune the hyperparameters of FlowGMM, GC-Flow, and its variants. The search spaces are as follows: number of flow layers: 2, 4, ..., 20; number of dense layers in each flow: 6, 10, 14; hidden size of flow layers: 128, 256, 512, 1024; weighting parameter λ: 0.01, ..., 0.5; Gaussian mean and covariance scale: [0.5, 10]; initial learning rate: 0.001, ..., 0.005; dropout rate: 0.1, ..., 0.6. (A training and grid-search sketch follows the table.)
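
The Open Datasets row states only that all six benchmarks are obtained from PyTorch Geometric. Below is a minimal loading sketch, assuming the standard Planetoid, Amazon, and WikiCS loaders and illustrative root paths; it is not code from the GC-Flow repository.

```python
# Hedged sketch: loading the six benchmarks with standard PyTorch Geometric
# loaders. Root directories are assumptions; the report only says the data
# sets come from PyTorch Geometric.
from torch_geometric.datasets import Planetoid, Amazon, WikiCS

cora = Planetoid(root="data/Planetoid", name="Cora")[0]      # predefined split (Kipf & Welling, 2017)
citeseer = Planetoid(root="data/Planetoid", name="CiteSeer")[0]
pubmed = Planetoid(root="data/Planetoid", name="PubMed")[0]
computers = Amazon(root="data/Amazon", name="Computers")[0]  # no predefined split; sampled below
photo = Amazon(root="data/Amazon", name="Photo")[0]
wiki_cs = WikiCS(root="data/WikiCS")[0]                      # ships with predefined splits

print(cora)  # e.g. Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], ...)
```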
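The Dataset Splits row gives the split sizes for the Amazon graphs (200/1300/1000 for Computers, 80/620/1000 for Photo) but not the sampling code. The sketch below assumes uniform random sampling without replacement; the helper random_split_masks is hypothetical.

```python
# Hedged sketch of the random node splits described in the Dataset Splits row.
# Uniform sampling without replacement is an assumption; the report does not
# quote the authors' exact sampling procedure.
import torch

def random_split_masks(num_nodes, n_train, n_val, n_test, seed=0):
    """Hypothetical helper: disjoint boolean train/val/test masks."""
    perm = torch.randperm(num_nodes, generator=torch.Generator().manual_seed(seed))
    bounds = [0, n_train, n_train + n_val, n_train + n_val + n_test]
    masks = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = torch.zeros(num_nodes, dtype=torch.bool)
        mask[perm[lo:hi]] = True
        masks.append(mask)
    return masks

# Applied to the Amazon graphs loaded in the previous sketch:
computers.train_mask, computers.val_mask, computers.test_mask = \
    random_split_masks(computers.num_nodes, 200, 1300, 1000)
photo.train_mask, photo.val_mask, photo.test_mask = \
    random_split_masks(photo.num_nodes, 80, 620, 1000)
```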
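The Experiment Setup row names Glorot initialization, the Adam optimizer, and a grid search over the listed ranges. The following sketch shows how those pieces fit together; the Sequential model is a placeholder stand-in (the real GC-Flow architecture is in the linked repository), and the grid abbreviates the quoted ranges.

```python
# Hedged sketch of the quoted training setup: Glorot (Xavier) initialization,
# Adam, and a grid search over the hyperparameter ranges listed above.
import itertools
import torch

def glorot_init(module):
    # Glorot & Bengio (2010) initialization for linear layers.
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)

search_space = {
    "flow_layers": [2, 4, 8, 16, 20],   # quoted range: 2, 4, ..., 20
    "dense_layers": [6, 10, 14],
    "hidden_size": [128, 256, 512, 1024],
    "lam": [0.01, 0.1, 0.5],            # weighting parameter λ
    "lr": [0.001, 0.005],               # quoted range: 0.001, ..., 0.005
    "dropout": [0.1, 0.3, 0.6],
}

for values in itertools.product(*search_space.values()):
    cfg = dict(zip(search_space.keys(), values))
    # Placeholder model, NOT the GC-Flow architecture; dimensions are
    # illustrative (e.g. Cora's 1433 features and 7 classes).
    model = torch.nn.Sequential(
        torch.nn.Linear(1433, cfg["hidden_size"]),
        torch.nn.ReLU(),
        torch.nn.Dropout(cfg["dropout"]),
        torch.nn.Linear(cfg["hidden_size"], 7),
    )
    model.apply(glorot_init)
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg["lr"])
    # ... train on the training split, score on validation, keep the best cfg ...
```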