GRANOLA: Adaptive Normalization for Graph Neural Networks
Authors: Moshe Eliasof, Beatrice Bevilacqua, Carola-Bibiane Schönlieb, Haggai Maron
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide theoretical results that support our design choices as well as an extensive empirical evaluation demonstrating the superior performance of GRANOLA over existing normalization techniques. |
| Researcher Affiliation | Collaboration | Moshe Eliasof, University of Cambridge, me532@cam.ac.uk; Beatrice Bevilacqua, Purdue University, bbevilac@purdue.edu; Carola-Bibiane Schönlieb, University of Cambridge, cbs31@cam.ac.uk; Haggai Maron, Technion & NVIDIA Research, hmaron@nvidia.com |
| Pseudocode | Yes | Algorithm 1 GRANOLA Layer |
| Open Source Code | Yes | Our code is available at https://github.com/MosheEliasof/GRANOLA. |
| Open Datasets | Yes | We experiment with the ZINC-12K molecular dataset [57, 28, 21]... We test our GRANOLA on the OGB collection [31]... We experimented with popular datasets from the TUD [42] repository. |
| Dataset Splits | Yes | We consider the dataset splits proposed in Dwivedi et al. [21]... We consider the scaffold splits proposed in Hu et al. [31]... For all the experiments with datasets from the TUDatasets repository, we followed the evaluation procedure proposed in Xu et al. [63], consisting of 10-fold cross validation and metric at the best averaged validation accuracy across the folds. |
| Hardware Specification | Yes | We ran our experiments on NVIDIA RTX3090 and RTX4090 GPUs, both having 24GB of memory. ... Specifically, we report the average time per batch measured on a Nvidia RTX-2080 GPU. |
| Software Dependencies | No | The paper states 'We implemented GRANOLA using Pytorch [50] (BSD-style license) and Pytorch-Geometric [24] (MIT license)' but does not provide specific version numbers for these software components, which is required for reproducibility. |
| Experiment Setup | Yes | For all models, we used a batch size tuned in {32, 64, 128}. To optimize the model we use the Adam optimizer with initial learning rate of 0.001, which is decayed by 0.5 every 300 epochs. The maximum number of epochs is set to 500. ... The downstream network is composed of a number of layers in {4, 6}, with an embedding dimension tuned in {32, 64}. |
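
The Experiment Setup and Open Datasets rows above pin down the training configuration fairly precisely (Adam, initial learning rate 0.001, decay by 0.5 every 300 epochs, at most 500 epochs, batch size tuned in {32, 64, 128}, 4-6 layers with embedding dimension 32 or 64, evaluated on ZINC-12K among others). The following is a minimal sketch of that configuration in PyTorch and PyTorch Geometric, not the authors' code: the plain GCN backbone and linear readout are stand-in assumptions, and GRANOLA's adaptive normalization layers are not reproduced here; only the quoted hyperparameters are taken from the paper.

```python
# Minimal sketch of the quoted training configuration (not the authors' code).
# The GCN backbone and linear readout are hypothetical stand-ins; the optimizer,
# schedule, epoch budget, and batch-size grid follow the quoted Experiment Setup.
import torch
from torch_geometric.datasets import ZINC
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCN, global_add_pool

train_dataset = ZINC(root="data/ZINC", subset=True, split="train")    # ZINC-12K subset
loader = DataLoader(train_dataset, batch_size=128, shuffle=True)      # tuned in {32, 64, 128}

backbone = GCN(in_channels=1, hidden_channels=64, num_layers=6)       # layers in {4, 6}, dim in {32, 64}
readout = torch.nn.Linear(64, 1)                                      # graph-level regression head

params = list(backbone.parameters()) + list(readout.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)                          # initial learning rate 0.001
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=300, gamma=0.5)  # x0.5 every 300 epochs

for epoch in range(500):                                               # maximum of 500 epochs
    backbone.train()
    for batch in loader:
        optimizer.zero_grad()
        h = backbone(batch.x.float(), batch.edge_index)                # node embeddings
        pred = readout(global_add_pool(h, batch.batch)).squeeze(-1)    # pool to graph-level prediction
        loss = torch.nn.functional.l1_loss(pred, batch.y)              # MAE, the standard ZINC metric
        loss.backward()
        optimizer.step()
    scheduler.step()
```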