Demystifying Oversmoothing in Attention-Based Graph Neural Networks
Authors: Xinyi Wu, Amir Ajorlou, Zihui Wu, Ali Jadbabaie
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical results on six real-world datasets with two attention-based GNN architectures and five common nonlinearities. In this section, we validate our theoretical findings via numerical experiments using the three commonly used homophilic benchmark datasets: Cora, CiteSeer, and PubMed, and the three commonly used heterophilic benchmark datasets: Cornell, Texas, and Wisconsin. |
| Researcher Affiliation | Academia | Xinyi Wu (1,2), Amir Ajorlou (2), Zihui Wu (3), Ali Jadbabaie (1,2); 1: Institute for Data, Systems and Society (IDSS), MIT; 2: Laboratory for Information and Decision Systems (LIDS), MIT; 3: Department of Computing and Mathematical Sciences (CMS), Caltech |
| Pseudocode | No | The paper presents mathematical equations and theoretical derivations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. It mentions using PyTorch and PyTorch Geometric, which are third-party libraries. |
| Open Datasets | Yes | We used torch_geometric.datasets.planetoid provided in PyTorch Geometric for the three homophilic datasets: Cora, CiteSeer, and PubMed with their default training and test splits. We used torch_geometric.datasets.WebKB provided in PyTorch Geometric for the three heterophilic datasets: Cornell, Texas, and Wisconsin with their default training and test splits. |
| Dataset Splits | Yes | We used torch_geometric.datasets.planetoid provided in PyTorch Geometric for the three homophilic datasets: Cora, CiteSeer, and PubMed with their default training and test splits. We used torch_geometric.datasets.WebKB provided in PyTorch Geometric for the three heterophilic datasets: Cornell, Texas, and Wisconsin with their default training and test splits. |
| Hardware Specification | Yes | We trained all of our models on a Tesla V100 GPU. |
| Software Dependencies | No | All models were implemented with PyTorch [27] and PyTorch Geometric [10]. |
| Experiment Setup | Yes | For each dataset, we trained a 128-layer single-head GAT and a 128-layer GCN with the random walk graph convolution D_deg^{-1} A, each having 32 hidden dimensions and trained using the standard features and splits. In all experiments, we used the Adam optimizer with a learning rate of 0.00001 and 0.0005 weight decay, and trained for 1000 epochs. |
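The random walk graph convolution D_deg^{-1} A named in the experiment setup normalizes the adjacency matrix A by each node's degree, so every row becomes a probability distribution over that node's neighbors. A minimal numpy sketch on a toy graph (the 4-node edge list here is an illustrative assumption, not a dataset from the paper):

```python
import numpy as np

# Toy undirected graph on 4 nodes, standing in for a benchmark graph.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0  # symmetric adjacency for an undirected graph

# Random walk graph convolution operator D_deg^{-1} A:
# divide each row of A by the corresponding node's degree.
deg = A.sum(axis=1)
P = A / deg[:, None]

# Each row of P now sums to 1, i.e. a transition distribution.
print(np.allclose(P.sum(axis=1), 1.0))
```

Repeatedly applying P (as a 128-layer stack of such convolutions effectively does) drives node features toward the random walk's stationary distribution, which is the oversmoothing phenomenon the paper analyzes.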