Mitigating Oversmoothing Through Reverse Process of GNNs for Heterophilic Graphs
Authors: Moonjeong Park, Jaeseung Heo, Dongwoo Kim
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through the experiments on heterophilic graph data, where adjacent nodes need to have different representations for successful classification, we show that the reverse process significantly improves the prediction performance in many cases. Additional analysis reveals that the reverse mechanism can mitigate the over-smoothing over hundreds of layers. Our code is available at https://github.com/ml-postech/reverse-gnn. |
| Researcher Affiliation | Academia | Moonjeong Park¹, Jaeseung Heo¹, Dongwoo Kim¹˒² — ¹Graduate School of Artificial Intelligence, Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea; ²Computer Science and Engineering, Pohang University of Science and Technology (POSTECH), Pohang, Republic of Korea. |
| Pseudocode | Yes | Algorithm 1 Inverse of GNN via fixed-point iteration |
| Open Source Code | Yes | Our code is available at https://github.com/ml-postech/reverse-gnn. |
| Open Datasets | Yes | For the node classification task, we utilize a diverse set of datasets to assess our model. For heterophilic data, we explore two Wikipedia graphs, Chameleon and Squirrel, and five additional datasets, Roman-Empire, Amazon Ratings, Minesweeper, Tolokers, and Questions, introduced by Platonov et al. (2023b). ... In the case of homophilic data, our selection includes three citation graphs: Cora, CiteSeer, and PubMed, along with two Amazon co-purchase graphs, Computers and Photo. The statistics of the datasets are summarized in Appendix B. |
| Dataset Splits | Yes | For the heterophilic datasets, we adopt the experimental setup from Platonov et al. (2023b), which provides ten random train/validation/test splits. ... For the homophilic datasets, we adopt the experimental setup from He et al. (2021), splitting datasets into 60%/20%/20% train/validation/test sets and using ten random splits for averaging results. |
| Hardware Specification | No | The paper does not specify the hardware used for experiments. It mentions 'We trained our models using...' but does not provide any details about GPU models, CPU types, or other specific hardware components. |
| Software Dependencies | No | The paper describes various methods and models (e.g., GRAND, GCN, GAT) and refers to libraries implicitly (e.g., for non-linear activation functions), but it does not specify any software names with their version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions). |
| Experiment Setup | Yes | Validation For all experiments, we set the number of epochs to 1,000 and apply early stopping when there is no performance improvement for 100 consecutive epochs. For GRAND+ReP, we validate the hyperparameters that maximize the validation metric in the following ranges: learning rate [10⁻⁵, 10⁻¹], T_F, T_R [0, 10], d {16, 32, 64, 128, 256, 512}, K [1, 8], d {4, 8, 16, 32, 64, 128}. ... For GCN+ReP and GAT+ReP, we validate the hyperparameters in the following ranges: learning rate [10⁻⁵, 10⁻¹], the number of forward and reverse layers L_F, L_R {1, 2, 4, 8, 16, 32, 64, 128, 256, 512}, dropout probability [0, 0.9] with step size of 0.1, c {0.1, 0.5, 0.9, 0.999, 0.99999}, convergence threshold for fixed point iteration {10⁻⁴, 10⁻⁵, 10⁻⁶}, d {128, 256, 512, 1024, 2048}, and M {8, 16, 32, 64}. We fix the non-linear activation function to ReLU. |
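The paper's Algorithm 1 inverts a GNN layer via fixed-point iteration, with a convergence threshold among the validated hyperparameters above. As a minimal illustrative sketch (not the authors' exact algorithm), a residual update y = x + g(x) with a contractive map g can be inverted by iterating x ← y − g(x) until the iterates stop changing:

```python
import numpy as np

def fixed_point_inverse(y, g, tol=1e-6, max_iter=100):
    """Recover x from y = x + g(x) by iterating x <- y - g(x).

    Converges when g is contractive (Lipschitz constant < 1),
    mirroring the fixed-point inversion scheme the paper's
    Algorithm 1 uses for reversing a GNN layer.
    """
    x = y.copy()
    for _ in range(max_iter):
        x_new = y - g(x)
        if np.linalg.norm(x_new - x) < tol:  # convergence threshold
            return x_new
        x = x_new
    return x

# Toy contractive map standing in for one GNN layer's update.
# The 0.4 scale keeps the Lipschitz constant below 1.
W = 0.4 * np.eye(3)
g = lambda x: np.tanh(W @ x)

x_true = np.array([0.5, -1.0, 2.0])
y = x_true + g(x_true)        # forward pass
x_rec = fixed_point_inverse(y, g)  # reverse pass
```

Here `W`, `g`, and the example vectors are hypothetical placeholders; the actual reverse process operates on node feature matrices with the trained message-passing weights.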