GC-Flow: A Graph-Based Flow Network for Effective Clustering
Authors: Tianchun Wang, Farzaneh Mirzazadeh, Xiang Zhang, Jie Chen
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct a comprehensive set of experiments to evaluate the performance of GC-Flow on graph data and demonstrate that it is competitive with GNNs for classification, while being advantageous in learning representations that extract the clustering structure of the data. |
| Researcher Affiliation | Collaboration | ¹Pennsylvania State University, ²MIT-IBM Watson AI Lab, IBM Research. Correspondence to: Tianchun Wang <tkw5356@psu.edu>, Jie Chen <chenjie@us.ibm.com>. |
| Pseudocode | No | The paper describes its methods through text and mathematical equations but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/xztcwang/GCFlow. |
| Open Datasets | Yes | Data sets. We use six benchmark GNN data sets. Data sets Cora, Citeseer, and Pubmed are citation graphs... We follow the predefined splits in Kipf & Welling (2017)... For statistics of the data sets, see Table 5 in Appendix G... All data sets used in the experiments are obtained from PyTorch Geometric. |
| Dataset Splits | Yes | We randomly sample 200/1300/1000 nodes for training/validation/testing for Computers and 80/620/1000 for Photo. The data set Wiki-CS is a web graph... We use one of the predefined splits. |
| Hardware Specification | Yes | We conduct the experiments on a server with four NVIDIA RTX A6000 GPUs (48GB memory each). |
| Software Dependencies | No | We implemented all models using PyTorch (Paszke et al., 2019), PyTorch Geometric (Fey & Lenssen, 2019), and Scikit-learn (Pedregosa et al., 2011). While some citations are provided, explicit version numbers for the key software dependencies are not listed. |
| Experiment Setup | Yes | Implementation details. For fair comparison, we run all models on the entire data set under the transductive semi-supervised setting. All models are initialized with Glorot initialization (Glorot & Bengio, 2010) and are trained using the Adam optimizer (Kingma & Ba, 2015)... Hyperparameters. We use grid search to tune the hyperparameters of Flow GMM, GC-Flow, and its variants. The search spaces are listed in the following: Number of flow layers: 2, 4, ..., 20; Number of dense layers in each flow: 6, 10, 14; Hidden size of flow layers: 128, 256, 512, 1024; Weighting parameter λ: 0.01, ..., 0.5; Gaussian mean and covariance scale: [0.5, 10]; Initial learning rate: 0.001, ..., 0.005; Dropout rate: 0.1, ..., 0.6. |
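The grid search over the hyperparameter spaces listed above can be sketched as follows. This is an illustrative snippet, not the authors' code: the `grid` helper and the dictionary keys are hypothetical names, and ranges abbreviated with "..." in the paper are shown here by their endpoints only.

```python
from itertools import product

# Hypothetical search space, following the ranges reported in the table.
# Lists abbreviated with "..." in the paper are truncated to their endpoints here.
search_space = {
    "num_flow_layers": [2, 4, 20],          # paper: 2, 4, ..., 20
    "dense_layers_per_flow": [6, 10, 14],
    "hidden_size": [128, 256, 512, 1024],
    "lambda_weight": [0.01, 0.5],           # paper: 0.01, ..., 0.5
}

def grid(space):
    """Yield one dict per hyperparameter combination (Cartesian product)."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid(search_space))
```

Each `configs` entry would then parameterize one training run; the best configuration is selected on the validation split described in the Dataset Splits row.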