Demystifying Oversmoothing in Attention-Based Graph Neural Networks
Authors: Xinyi Wu, Amir Ajorlou, Zihui Wu, Ali Jadbabaie
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our theoretical results on six real-world datasets with two attention-based GNN architectures and five common nonlinearities. In this section, we validate our theoretical findings via numerical experiments using the three commonly used homophilic benchmark datasets: Cora, CiteSeer, and PubMed, and the three commonly used heterophilic benchmark datasets: Cornell, Texas, and Wisconsin. |
| Researcher Affiliation | Academia | Xinyi Wu (1,2), Amir Ajorlou (2), Zihui Wu (3), Ali Jadbabaie (1,2); 1: Institute for Data, Systems and Society (IDSS), MIT; 2: Laboratory for Information and Decision Systems (LIDS), MIT; 3: Department of Computing and Mathematical Sciences (CMS), Caltech |
| Pseudocode | No | The paper presents mathematical equations and theoretical derivations, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain an explicit statement or link indicating that the source code for the described methodology is publicly available. It mentions using PyTorch and PyTorch Geometric, which are third-party libraries. |
| Open Datasets | Yes | We used torch_geometric.datasets.planetoid provided in PyTorch Geometric for the three homophilic datasets: Cora, CiteSeer, and PubMed with their default training and test splits. We used torch_geometric.datasets.WebKB provided in PyTorch Geometric for the three heterophilic datasets: Cornell, Texas, and Wisconsin with their default training and test splits. |
| Dataset Splits | Yes | We used torch_geometric.datasets.planetoid provided in PyTorch Geometric for the three homophilic datasets: Cora, CiteSeer, and PubMed with their default training and test splits. We used torch_geometric.datasets.WebKB provided in PyTorch Geometric for the three heterophilic datasets: Cornell, Texas, and Wisconsin with their default training and test splits. |
| Hardware Specification | Yes | We trained all of our models on a Tesla V100 GPU. |
| Software Dependencies | No | All models were implemented with PyTorch [27] and PyTorch Geometric [10]. |
| Experiment Setup | Yes | For each dataset, we trained a 128-layer single-head GAT and a 128-layer GCN with the random walk graph convolution D_deg^{-1} A, each having 32 hidden dimensions and trained using the standard features and splits. In all experiments, we used the Adam optimizer with a learning rate of 0.00001 and 0.0005 weight decay, and trained for 1000 epochs. |
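The random walk graph convolution D_deg^{-1} A named in the experiment setup normalizes the adjacency matrix A by each node's degree, so every row becomes a probability distribution over that node's neighbors. A minimal numpy sketch on a toy graph (the 4-node edge list here is an illustrative assumption, not a dataset from the paper):

```python
import numpy as np

# Toy undirected graph on 4 nodes, standing in for a benchmark graph.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]

n = 4
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0  # symmetric adjacency for an undirected graph

# Random walk graph convolution operator D_deg^{-1} A:
# divide each row of A by the corresponding node's degree.
deg = A.sum(axis=1)
P = A / deg[:, None]

# Each row of P now sums to 1, i.e. a transition distribution.
print(np.allclose(P.sum(axis=1), 1.0))
```

Repeatedly applying P (as a 128-layer stack of such convolutions effectively does) drives node features toward the random walk's stationary distribution, which is the oversmoothing phenomenon the paper analyzes.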