Break the Ceiling: Stronger Multi-scale Deep Graph Convolutional Networks

Authors: Sitao Luan, Mingde Zhao, Xiao-Wen Chang, Doina Precup

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | For empirical validation, we test different instances of the proposed architectures on multiple node classification tasks. The results show that even the simplest instance of the architectures achieves state-of-the-art performance, and the complex ones achieve surprisingly higher performance, with or without validation sets.
Researcher Affiliation | Collaboration | Sitao Luan1,2, Mingde Zhao1,2, Xiao-Wen Chang1, Doina Precup1,2,3 {sitao.luan@mail, mingde.zhao@mail, chang@cs, dprecup@cs}.mcgill.ca 1McGill University; 2Mila; 3DeepMind
Pseudocode | Yes | Table 1 (Algorithms in Matrix and Nodewise Forms) presents structured descriptions of algorithms such as Message Passing, GraphSAGE-GCN, Snowball, and Truncated Krylov in both nodewise and matrix forms.
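As an informal illustration of the Snowball form listed in Table 1 (each layer receives the concatenation of all previous hidden representations, propagated through the normalized adjacency), a minimal PyTorch sketch is given below. The layer widths, the tanh nonlinearity, and the dense adjacency are illustrative assumptions, not details taken from the authors' released code.

```python
# Minimal sketch of the Snowball idea: layer l takes [H_0, H_1, ..., H_l] as input.
# Assumptions (not from the paper's code): dense normalized adjacency, tanh activation.
import torch
import torch.nn as nn

class SnowballSketch(nn.Module):
    def __init__(self, adj, in_dim, hidden_dim, out_dim, n_layers):
        super().__init__()
        self.adj = adj  # assumed: dense, symmetrically normalized adjacency (N x N tensor)
        in_dims = [in_dim + i * hidden_dim for i in range(n_layers)]
        self.hidden = nn.ModuleList(nn.Linear(d, hidden_dim) for d in in_dims)
        self.out = nn.Linear(in_dim + n_layers * hidden_dim, out_dim)

    def forward(self, x):
        features = [x]  # H_0 = X (node feature matrix)
        for layer in self.hidden:
            stacked = torch.cat(features, dim=1)                  # [H_0, H_1, ..., H_l]
            features.append(torch.tanh(layer(self.adj @ stacked)))
        return self.out(self.adj @ torch.cat(features, dim=1))    # per-node class logits
```

Roughly speaking, the Truncated Krylov variant described in the same table instead concatenates successive powers of the normalized operator applied to the current layer's features before the linear transformation.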
Open Source Code | Yes | Source code to be found at https://github.com/PwnerHarry/Stronger_GCN
Open Datasets | Yes | The test cases include experiments on the public splits [37, 25] of Cora, Citeseer and PubMed
Dataset Splits | Yes | The test cases include experiments on the public splits [37, 25] of Cora, Citeseer and PubMed, as well as the crafted smaller splits that are more difficult [25, 21, 31]. We compare the instances against several methods under two experimental settings, with or without validation sets.
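For reference, the sketch below shows one way the public Planetoid splits mentioned above can be loaded; it uses torch_geometric, which is an assumption rather than a dependency confirmed by the paper.

```python
# Hedged sketch: load the standard public train/val/test masks for the three
# citation datasets via torch_geometric (not necessarily how the authors load them).
from torch_geometric.datasets import Planetoid

for name in ("Cora", "CiteSeer", "PubMed"):
    data = Planetoid(root="data", name=name, split="public")[0]
    print(name,
          int(data.train_mask.sum()),   # fixed public training nodes
          int(data.val_mask.sum()),     # fixed public validation nodes
          int(data.test_mask.sum()))    # fixed public test nodes
```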
Hardware Specification | No | The authors wish to express sincere gratitude for the computational resources of Compute Canada provided by Mila. However, no specific hardware models (e.g., GPU, CPU) or detailed specifications are mentioned.
Software Dependencies | No | The paper mentions 'optimizers RMSprop or Adam' but does not specify version numbers for these or any other software dependencies like libraries or programming languages.
Experiment Setup | Yes | These hyperparameters are reported in the appendix. They include the learning rate and weight decay for the optimizers RMSprop or Adam (for the cases with and without validation, respectively), taking values in the intervals [10^-6, 5×10^-3] and [10^-5, 10^-2], respectively; the width of hidden layers, taking values in {100, 200, ..., 5000}; the number of hidden layers, in {1, 2, ..., 50}; dropout, in (0, 0.99]; and the number of Krylov blocks, taking values in {1, 2, ..., 100}. An early stopping trick is also used to achieve better training: specifically, we terminate the training after 100 update steps of not improving the training loss.
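To make the described search space and stopping rule concrete, here is a hedged sketch; the random-search loop and function names are illustrative assumptions, while the value ranges and the 100-step patience come from the quoted setup.

```python
# Illustrative sampling of the reported hyperparameter ranges, plus the
# described early-stopping rule (stop after 100 non-improving update steps).
import random

def sample_hyperparameters():
    return {
        "lr": 10 ** random.uniform(-6, -2.3),          # roughly [1e-6, 5e-3]
        "weight_decay": 10 ** random.uniform(-5, -2),  # [1e-5, 1e-2]
        "hidden_width": random.choice(range(100, 5001, 100)),   # {100, 200, ..., 5000}
        "n_hidden_layers": random.randint(1, 50),
        "dropout": random.uniform(0.01, 0.99),         # (0, 0.99]
        "n_krylov_blocks": random.randint(1, 100),
    }

def train_with_early_stopping(step_fn, max_steps=100_000, patience=100):
    """Terminate after `patience` update steps without training-loss improvement."""
    best, since_best = float("inf"), 0
    for _ in range(max_steps):
        loss = step_fn()            # one optimizer update, returns the training loss
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
            if since_best >= patience:
                break
    return best
```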