Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
On the Scalability of GNNs for Molecular Graphs
Authors: Maciej Sypetkowski, Frederik Wenkel, Farimah Poursafaei, Nia Dickson, Karush Suri, Philip Fradkin, Dominique Beaini
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Specifically, we analyze message-passing networks, graph Transformers, and hybrid architectures on the largest public collection of 2D molecular graphs for supervised pretraining. For the first time, we observe that GNNs benefit tremendously from the increasing scale of depth, width, number of molecules and associated labels. |
| Researcher Affiliation | Collaboration | Maciej Sypetkowski (Valence Labs, Montreal); Frederik Wenkel (Valence Labs, Montreal; Université de Montréal, Mila Quebec); Farimah Poursafaei (Valence Labs, Montreal; McGill University, Mila Quebec); Nia Dickson (NVIDIA Corporation); Karush Suri (Valence Labs, Montreal); Philip Fradkin (Valence Labs, Montreal; University of Toronto, Vector Institute); Dominique Beaini (Valence Labs, Montreal; Université de Montréal, Mila Quebec) |
| Pseudocode | No | The paper presents mathematical equations for the architectures, but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | For pretraining, we use datasets and code from the literature [7]. The code can be found at https://github.com/datamol-io/graphium |
| Open Datasets | Yes | For pretraining, we use datasets and code from the literature [7]. The code can be found at https://github.com/datamol-io/graphium, while the data can be found at https://zenodo.org/records/10797794. |
| Dataset Splits | Yes | The models are tested in 2 different settings: (1) randomly split train and test sets for pretraining and (2) finetuning/probing of pretrained models on standard benchmarks. |
| Hardware Specification | Yes | We used multi-GPU training (with up to 8 NVIDIA A100-SXM4-40GB GPUs) and gradient accumulation, while adjusting batch size to keep the effective batch size constant. Most models were trained on single GPUs, but our 300M and 1B parameter models used 4 and 8 GPUs, respectively. |
| Software Dependencies | No | The paper mentions using Adam optimizer but does not specify versions for any programming languages, libraries (e.g., PyTorch, TensorFlow), or other software dependencies. |
| Experiment Setup | Yes | All models use 2-layer MLPs to encode node and edge features, respectively, followed by the core model of 16 layers of the MPNN++, Transformer or GPS++ (except for when scaling depth). ... Further, all layers use layer norm and dropout with p = 0.1. ... Our base MPNN++, Transformer and hybrid GPS++ models are trained using Adam with a base learning rate of 0.003, 0.001, and 0.001, respectively. We use 5 warm-up epochs followed by linear learning rate decay. All pretraining has been conducted with a batch size of 1024. |
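The hardware row notes that the authors adjusted the per-step batch size under gradient accumulation so that the effective batch size stayed constant across GPU counts. A minimal sketch of that bookkeeping, assuming the usual relation effective = per-GPU batch × num GPUs × accumulation steps (the function name and the per-GPU batch size in the example are illustrative, not from the paper):

```python
def accumulation_steps(effective_batch: int, per_gpu_batch: int, num_gpus: int) -> int:
    """Number of gradient-accumulation steps needed so that
    per_gpu_batch * num_gpus * steps equals the target effective batch size."""
    per_step = per_gpu_batch * num_gpus
    if effective_batch % per_step:
        raise ValueError("effective batch must be divisible by the per-step batch")
    return effective_batch // per_step

# Example: a 1024 effective batch on 8 GPUs with 32 graphs per GPU per step
# requires accumulating gradients over 4 steps before each optimizer update.
print(accumulation_steps(1024, 32, 8))
```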
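The experiment-setup row describes the learning-rate schedule only in words: 5 warm-up epochs followed by linear decay from a base rate of 0.003 (MPNN++) or 0.001 (Transformer, GPS++). A minimal sketch of such a schedule, assuming linear warmup from zero and decay to zero over a total epoch budget that the excerpt does not state (100 here is an assumption):

```python
def lr_at_epoch(epoch: int, base_lr: float,
                warmup_epochs: int = 5, total_epochs: int = 100) -> float:
    """Linear warmup to base_lr over warmup_epochs, then linear decay to 0.
    total_epochs is an illustrative assumption; the paper does not state it."""
    if epoch < warmup_epochs:
        # ramp up: reaches base_lr at the last warmup epoch
        return base_lr * (epoch + 1) / warmup_epochs
    # decay fraction of the post-warmup budget already consumed
    frac = (epoch - warmup_epochs) / max(1, total_epochs - warmup_epochs)
    return base_lr * max(0.0, 1.0 - frac)

# MPNN++ base rate 0.003: epoch 0 -> 0.0006, epoch 4 -> 0.003, then decaying.
print(lr_at_epoch(0, 0.003), lr_at_epoch(4, 0.003))
```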