Revisiting Over-smoothing and Over-squashing Using Ollivier-Ricci Curvature

Authors: Khang Nguyen, Nong Minh Hieu, Vinh Duc Nguyen, Nhat Ho, Stanley Osher, Tan Minh Nguyen

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we empirically verify the effectiveness of BORF on a variety of tasks compared to other rewiring alternatives. We seek to demonstrate the potential of curvature-based rewiring methods, and more generally, geometry-aware techniques in improving the performance of GNNs. Our code for the experiments is available at https://github.com/hieubkvn123/revisiting-gnn-curvature.
Researcher Affiliation | Collaboration | (1) FPT Software AI Center, Vietnam; (2) Faculty of Mathematics and Computer Science, University of Science, Vietnam National University Ho Chi Minh City, Vietnam; (3) Department of Statistics and Data Sciences, University of Texas at Austin, USA; (4) Department of Mathematics, University of California, Los Angeles, USA; (5) Department of Mathematics, National University of Singapore, Singapore.
Pseudocode | Yes | Algorithm 1: Batch Ollivier-Ricci Flow (BORF)
Input: graph G = (V, E), number of rewiring batches N, number of edges added per batch h, number of edges removed per batch k
for i = 1 to N do
    Find h edges (u_1, v_1), ..., (u_h, v_h) with minimal Ollivier-Ricci curvature κ, along with each summand π_j(p, q) d(p, q) in their optimal transportation cost sums, for all p, q ∈ V and j = 1, ..., h
    Find k edges (u_1, v_1), ..., (u_k, v_k) with maximal Ollivier-Ricci curvature κ
    for j = 1 to h do
        Add to G the edge (p*, q*) given by (p*, q*) = argmax_{(p, q)} π_j(p, q) d(p, q)
    end for
    Remove the edges (u_1, v_1), ..., (u_k, v_k) from G
end for
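To make the rewiring loop above concrete, the following is a minimal, self-contained Python sketch of one BORF-style pass over a NetworkX graph, using POT's ot.emd to obtain the optimal transport plan between uniform measures on the neighbourhoods of an edge's endpoints. The measure definitions, the finite distance fallback, and the default values of num_batches, h, and k are illustrative assumptions rather than the authors' exact implementation.

```python
# A minimal sketch of a BORF-style rewiring pass (cf. Algorithm 1), assuming a
# NetworkX graph and POT (ot.emd) for optimal transport. Neighbourhood measures,
# the distance fallback, and hyperparameters are illustrative assumptions.
import networkx as nx
import numpy as np
import ot  # Python Optimal Transport (POT)


def _dist(G, p, q, fallback=10.0):
    """Shortest-path distance with a finite fallback for disconnected pairs."""
    try:
        return float(nx.shortest_path_length(G, p, q))
    except nx.NetworkXNoPath:
        return fallback


def edge_curvature_and_plan(G, u, v):
    """kappa(u, v) = 1 - W1(mu_u, mu_v) for uniform measures on the
    neighbourhoods of u and v; also returns the optimal transport plan."""
    Nu, Nv = list(G.neighbors(u)), list(G.neighbors(v))
    mu_u = np.full(len(Nu), 1.0 / len(Nu))
    mu_v = np.full(len(Nv), 1.0 / len(Nv))
    D = np.array([[_dist(G, p, q) for q in Nv] for p in Nu])
    plan = ot.emd(mu_u, mu_v, D)           # optimal coupling pi
    kappa = 1.0 - float((plan * D).sum())  # 1 - Wasserstein-1 cost
    return kappa, plan, Nu, Nv, D


def borf(G, num_batches=3, h=4, k=2):
    """Per batch: add one supporting edge for each of the h most negatively
    curved edges, then remove the k most positively curved edges."""
    G = G.copy()
    for _ in range(num_batches):
        stats = {(u, v): edge_curvature_and_plan(G, u, v) for u, v in G.edges()}
        by_curv = sorted(stats, key=lambda e: stats[e][0])  # ascending curvature
        for u, v in by_curv[:h]:
            _, plan, Nu, Nv, D = stats[(u, v)]
            # Neighbour pair carrying the largest transport cost pi(p, q) * d(p, q).
            i, j = np.unravel_index(np.argmax(plan * D), D.shape)
            p, q = Nu[i], Nv[j]
            if p != q and not G.has_edge(p, q):
                G.add_edge(p, q)
        G.remove_edges_from(by_curv[-k:])
    return G
```

In practice a pass like this is applied once to the input graph as a preprocessing step, and the GNN is then trained on the rewired graph.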
Open Source Code | Yes | Our code for the experiments is available at https://github.com/hieubkvn123/revisiting-gnn-curvature.
Open Datasets | Yes | Datasets. We conduct our experiments on a range of widely used node classification and graph classification tasks. For node classification, we report our results on the datasets CORA, CITESEER (Yang et al., 2016), TEXAS, CORNELL, WISCONSIN (Pei et al., 2020) and CHAMELEON (Rozemberczki et al., 2021). For graph classification, we validate our method on the following benchmarks: ENZYMES, IMDB, MUTAG and PROTEINS from the TUDataset (Morris et al., 2020). A summary of dataset statistics is available in Appendix F.
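As a usage note, all of the benchmarks quoted above are exposed through PyTorch Geometric's dataset classes; the sketch below shows one way to load them. The root paths are placeholders, and IMDB is assumed here to refer to IMDB-BINARY.

```python
# A minimal sketch of loading the cited benchmarks with PyTorch Geometric.
# Root paths are placeholders; IMDB is assumed to mean IMDB-BINARY.
from torch_geometric.datasets import Planetoid, WebKB, WikipediaNetwork, TUDataset

# Node classification
cora      = Planetoid(root="data/Planetoid", name="Cora")
citeseer  = Planetoid(root="data/Planetoid", name="CiteSeer")
texas     = WebKB(root="data/WebKB", name="Texas")
cornell   = WebKB(root="data/WebKB", name="Cornell")
wisconsin = WebKB(root="data/WebKB", name="Wisconsin")
chameleon = WikipediaNetwork(root="data/Wikipedia", name="chameleon")

# Graph classification (TUDataset collection)
enzymes  = TUDataset(root="data/TUDataset", name="ENZYMES")
imdb     = TUDataset(root="data/TUDataset", name="IMDB-BINARY")
mutag    = TUDataset(root="data/TUDataset", name="MUTAG")
proteins = TUDataset(root="data/TUDataset", name="PROTEINS")
```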
Dataset Splits | Yes | For each graph and node classification experiment, we randomly split the dataset into train, validation and test sets 100 times, corresponding to 100 trials. For each trial, the GNN model is trained on the train set using the Adam optimizer and validated using the validation set. The test accuracy corresponding to the best accuracy on the validation set is recorded as the test accuracy of the current trial. After all 100 trials are finished, the mean test accuracy and the 95% confidence interval across all trials are computed and recorded in Tables 2 and 3. We also implemented a callback that stops the training process upon no improvement in the validation accuracy for 100 epochs. The train and validation fractions used to split the dataset are specified in Table 9.
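The per-trial protocol quoted above (random splits, validation-based model selection, and a 100-epoch patience callback) can be sketched as follows. The split fractions are placeholders, since the paper's actual values live in Table 9, and train_epoch/evaluate stand in for the dataset-specific training and evaluation code.

```python
# A sketch of one trial: random train/val/test split, validation-based model
# selection, and early stopping after 100 epochs without improvement.
# Split fractions are placeholders (the paper's values are given in Table 9).
import torch

def random_split(num_nodes, train_frac=0.5, val_frac=0.25, seed=0):
    """Return index tensors for the train, validation, and test partitions."""
    g = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_nodes, generator=g)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

def run_trial(train_epoch, evaluate, max_epochs=1000, patience=100):
    """train_epoch() performs one training epoch; evaluate(split) returns accuracy."""
    best_val, best_test, stale = 0.0, 0.0, 0
    for _ in range(max_epochs):
        train_epoch()
        val_acc = evaluate("val")
        if val_acc > best_val:
            best_val, best_test, stale = val_acc, evaluate("test"), 0
        else:
            stale += 1
            if stale >= patience:  # early-stopping callback
                break
    return best_test  # test accuracy at the best validation epoch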
Hardware Specification | Yes | Table 16. Server specifications for conducting all experiments.
Server 1: Architecture x86_64; OS Ubuntu 20.04.5 LTS x86_64; CPU Intel i7-10700KF (16) @ 5.100 GHz; GPU NVIDIA GeForce RTX 2080 Ti Rev. A; RAM 12 GB.
Server 2: Architecture x86_64; OS Ubuntu 20.04.5 LTS x86_64; CPU AMD EPYC 7742 64-Core; GPU NVIDIA A100 Tensor Core; RAM 40 GB.
Software Dependencies | Yes | All experiments were implemented in Python using PyTorch (Paszke et al., 2019), NumPy (Harris et al., 2020), PyG (PyTorch Geometric) (Fey & Lenssen, 2019), and POT (Python Optimal Transport) (Flamary et al., 2021), with figures created using TikZ (Tantau, 2023).
Experiment Setup | Yes | For graph and node classification, we utilized fixed model architectures with fixed numbers of GNN layers across all datasets. We used 3 GNN layers for node classification and 4 GNN layers for graph classification tasks. All the intermediate GNN layers (except for the input and output layers) have the same number of input and output dimensions, specified by the hidden dimensions. After every GNN layer, we also added a drop-out layer with a fixed drop-out probability and an activation layer. The specific hidden dimensions, drop-out probabilities, and final activation layers for both node and graph classification tasks are specified in the architecture settings in Table 8. ... The train and validation fractions used to split the dataset are specified in Table 9.
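For illustration, a minimal PyTorch Geometric version of the fixed node-classification architecture described above (3 GNN layers, each followed by an activation and drop-out) might look like the sketch below. GCNConv is used as an example layer type, and the hidden dimension, drop-out probability, and activation are placeholders for the values specified in Table 8.

```python
# A sketch of the fixed node-classification architecture: 3 GNN layers with an
# activation and drop-out between them. GCNConv, the hidden size, the drop-out
# probability, and ReLU are placeholder choices, not the paper's exact settings.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class NodeClassifier(torch.nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes, dropout=0.5):
        super().__init__()
        self.convs = torch.nn.ModuleList([
            GCNConv(in_dim, hidden_dim),       # input layer
            GCNConv(hidden_dim, hidden_dim),   # intermediate layer
            GCNConv(hidden_dim, num_classes),  # output layer
        ])
        self.dropout = dropout

    def forward(self, x, edge_index):
        for i, conv in enumerate(self.convs):
            x = conv(x, edge_index)
            if i < len(self.convs) - 1:
                x = F.relu(x)
                x = F.dropout(x, p=self.dropout, training=self.training)
        return x  # per-node class logits; the final activation is left to the loss
```

The graph-classification variant described in the quote would use 4 such layers followed by a readout (pooling) layer, with its own hidden dimension and final activation from Table 8.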