The Intelligible and Effective Graph Neural Additive Network

Authors: Maya Bechler-Speicher, Amir Globerson, Ran Gilad-Bachrach

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we evaluate GNAN on real-world graph and node labeling tasks, including large-scale, long-range, and heterophily datasets. We compare GNAN to multiple commonly used black-box GNNs, including GraphConv [52], GraphSAGE [30], the Graph Isomorphism Network (GIN) [33], the expressive version of the Graph Attention Network (GATv2) [29, 53], and the Graph Transformer (GTransformer) [54]. We also evaluate the FSGNN model, which disentangles the node features from the graph structure [35]. Information on the hyper-parameters tuned for each baseline can be found in the Appendix. We used the following common benchmarks:

Node labeling tasks: Cora, Citeseer, PubMed, and ogb-arxiv [55, 56] are paper citation networks where the goal is to classify papers into one of several topics; ogb-arxiv is a large-scale network. Cornell [57] and Tolokers [58] are heterophilous datasets. Cornell is a web-link network with the task of classifying nodes into one of five categories. Tolokers is based on data from the Toloka crowdsourcing platform: nodes represent tolokers (workers) who have participated in at least one of 13 selected projects, and an edge connects two tolokers if they have worked on the same task. The goal is to predict which tolokers have been banned in one of the projects. Node features are based on the worker's profile information and task performance statistics.

Graph labeling tasks: NCI1, Proteins, Mutagenicity, and PTC [59] are datasets of chemical compounds; in each dataset, the goal is to classify compounds according to some property of interest. The µ, α, and εHOMO [60] datasets are long-range molecular property prediction regression tasks over the large-scale QM9 molecular dataset. Additional data information, including the data statistics, can be found in the Appendix.

Protocol: For all tasks, we used existing splits, protocols, and metrics, as commonly used in the literature for each dataset. The complete protocols for each dataset are given in detail in the Appendix. The metrics we report are: accuracy for Cornell, Cora, Citeseer, PubMed, ogb-arxiv, Mutagenicity, PTC, NCI1, and Proteins; MAE for µ, α, and εHOMO; and ROC-AUC for Tolokers. For the node labeling tasks, we used the pre-defined splits in the data and followed the common protocols for each dataset; the results are an average over the test set using 5 or 10 random seeds. For the Proteins and NCI1 tasks, we followed the splits and the nested cross-validation protocol from [61]; the final reported result on these datasets is an average of 30 runs (10 folds and 3 random seeds). For NCI1 and PTC we followed the splits and protocol from [39] and report the average accuracy and std of a 10-fold nested cross-validation.

Results: The results are presented in Table 1.
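As a concrete illustration of the benchmark setup quoted above, the sketch below loads a few of the named datasets with PyTorch Geometric and defines the reported metrics. This is an assumed illustration, not the authors' code: the root paths and helper names are made up here, and the paper's own loaders and evaluation scripts may differ.

```python
# Minimal sketch (assumed setup, not the authors' code): loading some of the
# cited benchmarks with PyTorch Geometric and computing the reported metrics.
from torch_geometric.datasets import Planetoid, TUDataset, QM9
from sklearn.metrics import roc_auc_score

cora = Planetoid(root="data/Planetoid", name="Cora")  # node labeling, accuracy
nci1 = TUDataset(root="data/TUDataset", name="NCI1")  # graph labeling, accuracy
qm9 = QM9(root="data/QM9")                            # regression targets (e.g. mu, alpha), MAE

def node_accuracy(logits, y, mask):
    # Accuracy over a node mask, as reported for the citation networks.
    return (logits[mask].argmax(dim=-1) == y[mask]).float().mean().item()

def mae(pred, target):
    # Mean absolute error, as reported for the QM9 regression tasks.
    return (pred - target).abs().mean().item()

def binary_auc(scores, y):
    # ROC-AUC, as reported for Tolokers.
    return roc_auc_score(y.cpu().numpy(), scores.cpu().numpy())
```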
Researcher Affiliation | Collaboration | Maya Bechler-Speicher, Blavatnik School of Computer Science, Tel-Aviv University; Amir Globerson, Blavatnik School of Computer Science, Tel-Aviv University; Ran Gilad-Bachrach, Department of Bio-Medical Engineering and Edmond J. Safra Center for Bioinformatics, Tel-Aviv University (now also at Google Research).
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | Yes | The implementation can be found at https://github.com/mayabechlerspeicher/Graph-Neural-Additive-Networks---GNAN
Open Datasets | Yes | Node labeling tasks: Cora, Citeseer, PubMed, ogb-arxiv [55, 56] are paper citation networks where the goal is to classify papers into one of several topics. ... Graph labeling tasks: NCI1, Proteins, Mutagenicity & PTC [59] are datasets of chemical compounds.
Dataset Splits | Yes | For the node labeling tasks, we used the pre-defined splits in the data and followed the common protocols for each dataset. The results are an average over the test set using 5 or 10 random seeds. For the Proteins and NCI1 tasks, we followed the splits and the nested cross-validation protocol from [61]. The final reported result on these datasets is an average of 30 runs (10 folds and 3 random seeds). For NCI1 and PTC we followed the splits and protocol from [39] and report the average accuracy and std of a 10-fold nested cross-validation. ... For the Cornell dataset we used the splits and protocol from [57] and report the test accuracy averaged over 10 runs, using the best hyper-parameters found on the validation set. ... For the Tolokers dataset, we followed the protocol and pre-defined splits from [58, 67]. The reported result is an average of a 10-fold nested cross-validation. ... For all these datasets we report the test accuracies averaged over 5 runs, using the parameters obtained from the best accuracy on the validation set of Cora.
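The split-and-averaging scheme described in this row can be made concrete with a short sketch: 10 folds repeated over 3 seeds gives the 30 runs reported for Proteins and NCI1. In the sketch, train_and_evaluate is a hypothetical stand-in for the actual training routine, and the inner model-selection loop of the full nested protocol from [61] is elided.

```python
# Illustrative sketch of the 10-fold x 3-seed averaging described above.
# train_and_evaluate is a hypothetical callable returning one test score;
# the inner validation loop of the nested protocol is elided for brevity.
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cv_mean_std(y, train_and_evaluate, n_folds=10, seeds=(0, 1, 2)):
    indices = np.arange(len(y))
    scores = []
    for seed in seeds:
        skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
        for train_idx, test_idx in skf.split(indices, y):
            scores.append(train_and_evaluate(train_idx, test_idx, seed))
    # Report mean and std over all folds x seeds (30 runs for 10 x 3).
    return float(np.mean(scores)), float(np.std(scores))
```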
Hardware Specification | Yes | All experiments ran on an NVIDIA GeForce RTX 3090 GPU.
Software Dependencies | No | All our baselines are implemented using PyTorch [65] and PyTorch Geometric [66]. (No specific version numbers are mentioned for these software components.)
Experiment Setup | Yes | All GNNs (excluding GNAN) use ReLU activations with {3, 5} layers and 64 hidden channels. They were trained with the Adam optimizer over 1000 epochs, with early stopping on the validation loss with a patience of 100 steps, weight decay of 1e-4, learning rate in {1e-3, 1e-4}, dropout rate in {0, 0.5}, and a train batch size of 32. In GNAN, all the feature and distance networks use ReLU activations with {3, 5} layers and {64, 32} hidden channels. They were trained with the Adam optimizer over 1000 epochs, with weight decay in {0, 5e-4}, learning rate in {1e-2, 1e-3}, and dropout rate in {0, 0.6}.
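The grids in this row translate directly into a small hyper-parameter sweep. The sketch below enumerates the quoted search spaces; the dictionary keys and the expansion helper are illustrative assumptions, not the authors' configuration code.

```python
# Illustrative expansion of the quoted hyper-parameter grids (names assumed).
from itertools import product

GNN_GRID = {
    "layers": [3, 5], "hidden": [64],
    "lr": [1e-3, 1e-4], "weight_decay": [1e-4],
    "dropout": [0.0, 0.5], "batch_size": [32],
}
GNAN_GRID = {
    "layers": [3, 5], "hidden": [64, 32],
    "lr": [1e-2, 1e-3], "weight_decay": [0.0, 5e-4],
    "dropout": [0.0, 0.6],
}

def iter_configs(grid):
    # Yield every combination in the grid as a config dictionary.
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

# Per the quoted protocol, both families train with Adam for up to 1000
# epochs, early-stopping on validation loss with a patience of 100 steps.
```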