On the Generalization of Equivariant Graph Neural Networks
Authors: Rafal Karczewski, Amauri H Souza, Vikas Garg
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experiments on real-world datasets substantiate our analysis, demonstrating a high correlation between theoretical and empirical generalization gaps and the effectiveness of the proposed regularization scheme." and "We performed extensive experiments to validate our theory." |
| Researcher Affiliation | Collaboration | 1) Department of Computer Science, Aalto University, Finland; 2) Federal Institute of Ceará, Brazil; 3) Yai Yai Ltd. |
| Pseudocode | No (illustrative layer sketch after the table) | The paper describes the EGNN model's computational steps using mathematical equations but does not present them in a structured pseudocode or algorithm block. |
| Open Source Code | Yes | Our code is available at https://github.com/Aalto-QuML/GeneralizationEGNNs. |
| Open Datasets | Yes | We consider four molecular property prediction tasks from the QM9 dataset (Ramakrishnan et al., 2014). |
| Dataset Splits | Yes (hypothetical split sketch after the table) | From the data split available in (Satorras et al., 2021; Anderson et al., 2019), we select a subset of 2K molecules for training and use the entire val/test partitions with approximately 17K and 13K molecules, respectively. |
| Hardware Specification | No | The paper states 'CSC IT Center for Science, Finland, provided computational support for this work' but does not specify any particular hardware models (e.g., GPU, CPU models, or cloud instance types) used for the experiments. |
| Software Dependencies | No | The paper mentions implementing models using 'PyTorch' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes (training-loop sketch after the table) | "We train all models for 1000 epochs using the Adam optimizer. We use batch size equal to 96 and cosine decay for the learning rate starting at 10^-3, except for the Mu (µ) property, where the initial value was set to 5 × 10^-4. We run five independent trials with different seeds." and "For the experiments in Figure 3, we use width d = 64 (for all layers) and L_egnn = 5 for the experiments regarding the spectral norm, L_egnn = 3 for the one regarding the width, and width d = 16 for assessing the generalization gap in terms of the number of layers. We apply ε-normalization with ε = 1. All internal MLPs (ϕ_h, ϕ_z, ϕ_µ, ϕ_out) have two layers with SiLU activation function." |
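
Because the paper specifies the EGNN forward pass only through equations (see the Pseudocode row), the block below is a minimal PyTorch sketch of one E(n)-equivariant message-passing layer in the style of Satorras et al. (2021). The MLP names (`phi_m`, `phi_z`, `phi_h`), the dense fully connected formulation, and the exact placement of the ε-normalization are illustrative assumptions rather than the authors' implementation; the two-layer SiLU MLPs and ε = 1 follow the reported setup.

```python
import torch
import torch.nn as nn

class EGNNLayer(nn.Module):
    """Minimal sketch of one E(n)-equivariant layer (Satorras et al., 2021).

    Not the authors' implementation: the MLP names, the dense fully
    connected formulation, and the exact form of the epsilon-normalization
    are illustrative assumptions.
    """

    def __init__(self, d, eps=1.0):
        super().__init__()
        self.eps = eps  # epsilon-normalization constant (eps = 1 in the experiments)
        # Two-layer SiLU MLPs, matching the reported internal-MLP structure.
        self.phi_m = nn.Sequential(nn.Linear(2 * d + 1, d), nn.SiLU(), nn.Linear(d, d))
        self.phi_z = nn.Sequential(nn.Linear(d, d), nn.SiLU(), nn.Linear(d, 1))
        self.phi_h = nn.Sequential(nn.Linear(2 * d, d), nn.SiLU(), nn.Linear(d, d))

    def forward(self, h, z):
        # h: (n, d) invariant node features, z: (n, 3) equivariant coordinates.
        n = h.size(0)
        diff = z.unsqueeze(1) - z.unsqueeze(0)              # (n, n, 3) relative positions
        dist2 = (diff ** 2).sum(dim=-1, keepdim=True)       # (n, n, 1) squared distances
        hi = h.unsqueeze(1).expand(n, n, -1)
        hj = h.unsqueeze(0).expand(n, n, -1)
        mask = 1.0 - torch.eye(n, device=h.device).unsqueeze(-1)   # drop self-messages
        m = self.phi_m(torch.cat([hi, hj, dist2], dim=-1)) * mask  # (n, n, d) messages
        # Equivariant coordinate update, normalized by distance + eps.
        w = self.phi_z(m) / (dist2.clamp(min=1e-12).sqrt() + self.eps)
        z_new = z + (diff * w).sum(dim=1)
        # Invariant feature update from aggregated messages.
        h_new = self.phi_h(torch.cat([h, m.sum(dim=1)], dim=-1))
        return h_new, z_new
```

Stacking several such layers (e.g. L_egnn = 5 with width d = 64, as in the spectral-norm experiments) and adding a readout MLP corresponding to ϕ_out would give a full model in this style.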
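
For the Dataset Splits row, the snippet below only illustrates the reported split sizes using `torch_geometric`'s QM9 class; the paper reuses the train/val/test partition of Satorras et al. (2021) and Anderson et al. (2019), which the random permutation here does not reproduce.

```python
import torch
from torch_geometric.datasets import QM9
from torch_geometric.loader import DataLoader

# Hypothetical illustration of the split sizes only; the paper takes its
# partition from Satorras et al. (2021) / Anderson et al. (2019) rather
# than a random permutation.
dataset = QM9(root="data/QM9")
perm = torch.randperm(len(dataset))

train_set = dataset[perm[:2000]]        # 2K-molecule training subset
val_set   = dataset[perm[2000:19000]]   # ~17K validation molecules
test_set  = dataset[perm[19000:32000]]  # ~13K test molecules

train_loader = DataLoader(train_set, batch_size=96, shuffle=True)  # batch size from the paper
```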
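
The Experiment Setup row maps onto a short training-loop skeleton like the one below. The `model` and `train_loader` objects are placeholders and the L1 loss is an assumption; only the optimizer, epoch count, batch size, learning rates, and cosine decay come from the quoted text.

```python
import torch

def train(model, train_loader, target="alpha", epochs=1000):
    # Skeleton of the reported setup: Adam, 1000 epochs, cosine LR decay
    # starting at 1e-3 (5e-4 for the dipole moment mu); batch size 96 is
    # assumed to be set in the DataLoader. The L1 loss is an assumption;
    # the paper quote does not name the training objective.
    lr = 5e-4 if target == "mu" else 1e-3
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    loss_fn = torch.nn.L1Loss()
    for _ in range(epochs):
        for batch in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(batch), batch.y)
            loss.backward()
            optimizer.step()
        scheduler.step()
    return model
```

Repeating this with five different random seeds corresponds to the five independent trials reported.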