Towards Deeper Graph Neural Networks with Differentiable Group Normalization

Authors: Kaixiong Zhou, Xiao Huang, Yuening Li, Daochen Zha, Rui Chen, Xia Hu

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on real-world datasets demonstrate that DGN makes GNN models more robust to over-smoothing and achieves better performance with deeper GNNs.
Researcher Affiliation | Collaboration | Kaixiong Zhou, Texas A&M University, zkxiong@tamu.edu; Xiao Huang, The Hong Kong Polytechnic University, xiaohuang@comp.polyu.edu.hk; Yuening Li, Texas A&M University, liyuening@tamu.edu; Daochen Zha, Texas A&M University, daochen.zha@tamu.edu; Rui Chen, Samsung Research America, rui.chen1@samsung.com; Xia Hu, Texas A&M University, xiahu@tamu.edu
Pseudocode | No | The paper does not contain a pseudocode block or an algorithm section explicitly labeled as such.
Open Source Code | Yes | The implementation of our approaches is publicly available at https://github.com/Kaixiong-Zhou/DGN.
Open Datasets | Yes | Joining the practice of previous work, we evaluate GNN models by performing the node classification task on four datasets: Cora, Citeseer, Pubmed [24], and Coauthor CS [36]. (A hedged data-loading sketch follows this table.)
Dataset Splits | No | The paper mentions using validation sets (e.g., "We remove the input features of both validation and test sets in Cora" and "λ is tuned on validation sets"), but it neither specifies the exact percentages or counts for these splits nor cites a source that defines them, which limits reproducibility.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU or CPU models, memory, or specific cloud instance types.
Software Dependencies | No | The paper mentions using the Adam optimizer and the Glorot initialization algorithm, but it does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA versions).
Experiment Setup | Yes | We set the number of hidden units to 16 for GCN and GAT models. The number of attention heads in GAT is 1. [...] We train with a maximum of 1000 epochs using the Adam optimizer [37] and early stopping. Weights in GNN models are initialized with the Glorot algorithm [38]. We use the following sets of hyperparameters for Citeseer, Cora, Coauthor CS: 0.6 (dropout rate), 5 × 10^−4 (L2 regularization), 5 × 10^−3 (learning rate); and for Pubmed: 0.6 (dropout rate), 1 × 10^−3 (L2 regularization), 1 × 10^−2 (learning rate). (A hedged training sketch using these values follows this table.)
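The four benchmarks named in the Open Datasets row are all distributed with PyTorch Geometric, so a minimal loading sketch is shown below under that assumption. The `root` cache directory is hypothetical, and the paper's own preprocessing (e.g., removing input features of validation and test nodes for the over-smoothing study) is not reproduced here.

```python
# Minimal sketch: loading the four node-classification benchmarks with
# PyTorch Geometric. This assumes the standard Planetoid / Coauthor dataset
# wrappers, not the authors' exact data pipeline.
from torch_geometric.datasets import Planetoid, Coauthor

root = "data"  # hypothetical local cache directory

cora     = Planetoid(root, name="Cora")
citeseer = Planetoid(root, name="CiteSeer")
pubmed   = Planetoid(root, name="PubMed")
coauthor = Coauthor(root, name="CS")  # Coauthor CS

for dataset in (cora, citeseer, pubmed, coauthor):
    data = dataset[0]  # each benchmark is a single graph
    print(dataset, data.num_nodes, dataset.num_classes)
```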
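For the Experiment Setup row, the sketch below wires the reported hyperparameters (16 hidden units, dropout 0.6, Adam with L2 regularization expressed as weight decay, at most 1000 epochs with early stopping) into an ordinary two-layer GCN. The model definition, the patience value, and the use of validation accuracy as the stopping criterion are assumptions; the paper's DGN layers are not implemented here.

```python
# Minimal sketch plugging the quoted hyperparameters into a plain two-layer
# GCN. The patience value and model definition are assumptions; the paper's
# DGN normalization layers and exact early-stopping rule are not reproduced.
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    def __init__(self, in_dim, num_classes, hidden=16, dropout=0.6):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hidden)       # GCNConv uses Glorot init by default
        self.conv2 = GCNConv(hidden, num_classes)
        self.dropout = dropout

    def forward(self, x, edge_index):
        x = F.dropout(x, p=self.dropout, training=self.training)
        x = F.relu(self.conv1(x, edge_index))
        x = F.dropout(x, p=self.dropout, training=self.training)
        return self.conv2(x, edge_index)

data = cora[0]  # reuse a dataset from the loading sketch above
model = GCN(cora.num_features, cora.num_classes)

# Citeseer/Cora/Coauthor CS: lr = 5e-3, L2 = 5e-4; Pubmed: lr = 1e-2, L2 = 1e-3.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-3, weight_decay=5e-4)

best_val, patience, wait = 0.0, 100, 0     # patience is an assumed value
for epoch in range(1000):                  # maximum of 1000 epochs
    model.train()
    optimizer.zero_grad()
    out = model(data.x, data.edge_index)
    loss = F.cross_entropy(out[data.train_mask], data.y[data.train_mask])
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        pred = model(data.x, data.edge_index).argmax(dim=-1)
        val_acc = (pred[data.val_mask] == data.y[data.val_mask]).float().mean().item()
    if val_acc > best_val:
        best_val, wait = val_acc, 0
    else:
        wait += 1
        if wait >= patience:               # simple early stopping on validation accuracy
            break
```

Expressing the quoted L2 regularization as Adam's `weight_decay` is a common convention but still an interpretation; the paper does not state how the penalty was applied.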