Old can be Gold: Better Gradient Flow can Make Vanilla-GCNs Great Again
Authors: Ajay Jaiswal, Peihao Wang, Tianlong Chen, Justin Rousseau, Ying Ding, Zhangyang Wang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide extensive empirical evidence across multiple datasets that our methods improve gradient flow in deep vanilla-GCNs and significantly boost their performance to comfortably compete with and outperform many fancy state-of-the-art methods. In this section, we first provide experimental evidence to augment our signal propagation hypothesis and show that our newly proposed methods facilitate healthy gradient flow during the training of deep vanilla-GCNs. Next, we extensively evaluate our methods against state-of-the-art graph neural network models and techniques to improve vanilla-GCNs on a wide variety of open graph datasets. |
| Researcher Affiliation | Academia | University of Texas at Austin {ajayjaiswal, peihaowang, tianlong.chen, atlaswang}@utexas.edu {justin.rousseau, ying.ding}@austin.utexas.edu |
| Pseudocode | No | The paper describes methods mathematically and textually but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Codes are available at: https://github.com/VITA-Group/GradientGCN. |
| Open Datasets | Yes | We use three standard citation network datasets in the GNN domain, Cora, Citeseer, and Pubmed [48], for evaluating our proposed methods against state-of-the-art GNN models and techniques. For our evaluation on Cora, Citeseer, Pubmed, and OGBN-ArXiv, we have closely followed the data split settings and metrics reported by the recent benchmark [49]. |
| Dataset Splits | Yes | For our evaluation on Cora, Citeseer, Pubmed, and OGBN-ArXiv, we have closely followed the data split settings and metrics reported by the recent benchmark [49]. Figure 3 illustrates the comparison of validation loss and gradient flow in vanilla-GCNs with 2 and 10 layers on Cora, Citeseer, and Pubmed. |
| Hardware Specification | Yes | All experiments on large graph datasets, e.g., OGBN-Ar Xiv, are conducted on single 48G Quadro RTX 8000 GPU, while small graph experiments are completed using a single 16G RTX 5000 GPU. |
| Software Dependencies | No | We have used the basic vanilla-GCN implementation in PyTorch provided by the authors of [1] to incorporate our proposed techniques and show their effectiveness in making the traditional GCN comparable to or better than SOTA. |
| Experiment Setup | Yes | Table 1: Hyperparameter configuration for our proposed method on representative datasets, given as {Learning rate, Weight Decay, Hidden dimension}: {0.005, 5e-4, 64}, {0.005, 5e-4, 64}, {0.01, 5e-4, 64}, {0.005, 0, 256}. We use the Adam optimizer for our experiments and performed a grid search to tune hyperparameters for our proposed methods; our settings are reported in Table 1. For all our experiments, we have trained our modified GCNs for 1500 epochs and 100 independent repetitions following [49] and reported average performances with the standard deviations of the node classification accuracies. (A minimal configuration sketch is given below this table.) |
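
As a point of reference for the setup quoted above, here is a minimal, self-contained sketch of the kind of training configuration it describes: a two-layer GCN trained with Adam (lr 0.005, weight decay 5e-4, hidden dimension 64) for 1500 epochs, with per-layer gradient norms logged as a crude gradient-flow probe. The `DenseGCNLayer`/`VanillaGCN` classes and the random toy graph are illustrative assumptions, not the authors' released implementation, and the paper's proposed gradient-flow techniques are not included.

```python
# Hedged sketch of the reported training configuration (not the authors' code):
# Adam, lr 0.005, weight decay 5e-4, hidden dim 64, 1500 epochs.
# Dataset loading (Cora/Citeseer/Pubmed) and the paper's modifications are omitted.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseGCNLayer(nn.Module):
    """Generic dense Kipf-and-Welling-style GCN layer, for illustration only."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x, a_hat):
        # a_hat: symmetrically normalized adjacency with self-loops, D^-1/2 (A+I) D^-1/2
        return a_hat @ self.lin(x)

class VanillaGCN(nn.Module):
    def __init__(self, in_dim, hidden_dim, num_classes):
        super().__init__()
        self.gc1 = DenseGCNLayer(in_dim, hidden_dim)
        self.gc2 = DenseGCNLayer(hidden_dim, num_classes)

    def forward(self, x, a_hat):
        return self.gc2(F.relu(self.gc1(x, a_hat)), a_hat)

# Toy stand-in graph (a real run would load a citation-network dataset instead).
n, in_dim, num_classes = 100, 32, 7
x = torch.randn(n, in_dim)
a = (torch.rand(n, n) < 0.05).float()
a = ((a + a.t()) > 0).float()
a.fill_diagonal_(0)
a = a + torch.eye(n)                      # add self-loops
d_inv_sqrt = a.sum(1).pow(-0.5)
a_hat = d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
y = torch.randint(0, num_classes, (n,))
train_mask = torch.rand(n) < 0.3

model = VanillaGCN(in_dim, hidden_dim=64, num_classes=num_classes)
optimizer = torch.optim.Adam(model.parameters(), lr=0.005, weight_decay=5e-4)

for epoch in range(1500):
    model.train()
    optimizer.zero_grad()
    out = model(x, a_hat)
    loss = F.cross_entropy(out[train_mask], y[train_mask])
    loss.backward()
    if epoch % 500 == 0:
        # Per-layer gradient norms: a simple proxy for inspecting gradient flow.
        grad_norms = {name: p.grad.norm().item() for name, p in model.named_parameters()}
        print(epoch, round(loss.item(), 4), grad_norms)
    optimizer.step()
```

A full reproduction would replace the toy graph with the benchmark splits of [49] and repeat the run 100 times to report mean and standard deviation of node-classification accuracy, as stated in the setup above.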