GC-Flow: A Graph-Based Flow Network for Effective Clustering

Authors: Tianchun Wang, Farzaneh Mirzazadeh, Xiang Zhang, Jie Chen

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct a comprehensive set of experiments to evaluate the performance of GC-Flow on graph data and demonstrate that it is competitive with GNNs for classification, while being advantageous in learning representations that extract the clustering structure of the data.
Researcher Affiliation | Collaboration | ¹Pennsylvania State University, ²MIT-IBM Watson AI Lab, IBM Research. Correspondence to: Tianchun Wang <tkw5356@psu.edu>, Jie Chen <chenjie@us.ibm.com>.
Pseudocode | No | The paper describes its methods through text and mathematical equations but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/xztcwang/GCFlow.
Open Datasets | Yes | Data sets. We use six benchmark GNN data sets. Data sets Cora, Citeseer, and Pubmed are citation graphs... We follow the predefined splits in Kipf & Welling (2017)... For statistics of the data sets, see Table 5 in Appendix G... All data sets used in the experiments are obtained from PyTorch Geometric. (A hedged loading sketch follows the table.)
Dataset Splits | Yes | We randomly sample 200/1300/1000 nodes for training/validation/testing for Computers and 80/620/1000 for Photo. The data set Wiki-CS is a web graph... We use one of the predefined splits. (A split-construction sketch follows the table.)
Hardware Specification | Yes | We conduct the experiments on a server with four NVIDIA RTX A6000 GPUs (48GB memory each).
Software Dependencies | No | We implemented all models using PyTorch (Paszke et al., 2019), PyTorch Geometric (Fey & Lenssen, 2019), and Scikit-learn (Pedregosa et al., 2011). While citations are provided, explicit version numbers for the key software dependencies are not listed.
Experiment Setup | Yes | Implementation details. For fair comparison, we run all models on the entire data set under the transductive semi-supervised setting. All models are initialized with Glorot initialization (Glorot & Bengio, 2010) and are trained using the Adam optimizer (Kingma & Ba, 2015)... Hyperparameters. We use grid search to tune the hyperparameters of FlowGMM, GC-Flow, and its variants. The search spaces are as follows: number of flow layers: 2, 4, ..., 20; number of dense layers in each flow: 6, 10, 14; hidden size of flow layers: 128, 256, 512, 1024; weighting parameter λ: 0.01, ..., 0.5; Gaussian mean and covariance scale: [0.5, 10]; initial learning rate: 0.001, ..., 0.005; dropout rate: 0.1, ..., 0.6. (A training and grid-search sketch follows the table.)
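
The Open Datasets row states only that all six benchmarks are obtained from PyTorch Geometric. Below is a minimal loading sketch, assuming the standard Planetoid, Amazon, and WikiCS loaders and illustrative root paths; it is not code from the GC-Flow repository.

```python
# Hedged sketch: loading the six benchmarks with standard PyTorch Geometric
# loaders. Root directories are assumptions; the report only says the data
# sets come from PyTorch Geometric.
from torch_geometric.datasets import Planetoid, Amazon, WikiCS

cora = Planetoid(root="data/Planetoid", name="Cora")[0]      # predefined split (Kipf & Welling, 2017)
citeseer = Planetoid(root="data/Planetoid", name="CiteSeer")[0]
pubmed = Planetoid(root="data/Planetoid", name="PubMed")[0]
computers = Amazon(root="data/Amazon", name="Computers")[0]  # no predefined split; sampled below
photo = Amazon(root="data/Amazon", name="Photo")[0]
wiki_cs = WikiCS(root="data/WikiCS")[0]                      # ships with predefined splits

print(cora)  # e.g. Data(x=[2708, 1433], edge_index=[2, 10556], y=[2708], ...)
```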
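The Dataset Splits row gives the split sizes for the Amazon graphs (200/1300/1000 for Computers, 80/620/1000 for Photo) but not the sampling code. The sketch below assumes uniform random sampling without replacement; the helper random_split_masks is hypothetical.

```python
# Hedged sketch of the random node splits described in the Dataset Splits row.
# Uniform sampling without replacement is an assumption; the report does not
# quote the authors' exact sampling procedure.
import torch

def random_split_masks(num_nodes, n_train, n_val, n_test, seed=0):
    """Hypothetical helper: disjoint boolean train/val/test masks."""
    perm = torch.randperm(num_nodes, generator=torch.Generator().manual_seed(seed))
    bounds = [0, n_train, n_train + n_val, n_train + n_val + n_test]
    masks = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        mask = torch.zeros(num_nodes, dtype=torch.bool)
        mask[perm[lo:hi]] = True
        masks.append(mask)
    return masks

# Applied to the Amazon graphs loaded in the previous sketch:
computers.train_mask, computers.val_mask, computers.test_mask = \
    random_split_masks(computers.num_nodes, 200, 1300, 1000)
photo.train_mask, photo.val_mask, photo.test_mask = \
    random_split_masks(photo.num_nodes, 80, 620, 1000)
```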
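The Experiment Setup row names Glorot initialization, the Adam optimizer, and a grid search over the listed ranges. The following sketch shows how those pieces fit together; the Sequential model is a placeholder stand-in (the real GC-Flow architecture is in the linked repository), and the grid abbreviates the quoted ranges.

```python
# Hedged sketch of the quoted training setup: Glorot (Xavier) initialization,
# Adam, and a grid search over the hyperparameter ranges listed above.
import itertools
import torch

def glorot_init(module):
    # Glorot & Bengio (2010) initialization for linear layers.
    if isinstance(module, torch.nn.Linear):
        torch.nn.init.xavier_uniform_(module.weight)
        if module.bias is not None:
            torch.nn.init.zeros_(module.bias)

search_space = {
    "flow_layers": [2, 4, 8, 16, 20],   # quoted range: 2, 4, ..., 20
    "dense_layers": [6, 10, 14],
    "hidden_size": [128, 256, 512, 1024],
    "lam": [0.01, 0.1, 0.5],            # weighting parameter λ
    "lr": [0.001, 0.005],               # quoted range: 0.001, ..., 0.005
    "dropout": [0.1, 0.3, 0.6],
}

for values in itertools.product(*search_space.values()):
    cfg = dict(zip(search_space.keys(), values))
    # Placeholder model, NOT the GC-Flow architecture; dimensions are
    # illustrative (e.g. Cora's 1433 features and 7 classes).
    model = torch.nn.Sequential(
        torch.nn.Linear(1433, cfg["hidden_size"]),
        torch.nn.ReLU(),
        torch.nn.Dropout(cfg["dropout"]),
        torch.nn.Linear(cfg["hidden_size"], 7),
    )
    model.apply(glorot_init)
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg["lr"])
    # ... train on the training split, score on validation, keep the best cfg ...
```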