Expressivity and Generalization: Fragment-Biases for Molecular GNNs

Authors: Tom Wollschläger, Niklas Kemper, Leon Hetzel, Johanna Sommer, Stephan Günnemann

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show the effectiveness of our model on synthetic and real-world data where we outperform all GNNs on Peptides and have 12% lower error than all GNNs on ZINC and 34% lower error than other fragment-biased models. Furthermore, we show that our model exhibits superior generalization capabilities compared to the latest transformer-based architectures, positioning it as a robust solution for a range of molecular modeling tasks.
Researcher Affiliation | Academia | ¹School of Computation, Information & Technology, Technical University of Munich. ²Munich Data Science Institute, Germany. ³Helmholtz Center for Computational Health, Munich, Germany. Correspondence to: Tom Wollschläger <t.wollschlaeger@tum.de>, Niklas Kemper <niklas.kemper@tum.de>.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Find our code at cs.cit.tum.de/daml/fragment-biased-gnns/
Open Datasets | Yes | To evaluate the predictive performance on real-world molecular dataset, we use the long-range peptides benchmark (Dwivedi et al., 2022) and the large-scale molecular benchmark ZINC (Sterling & Irwin, 2015).
Dataset Splits | No | The paper mentions using 'ZINC-full validation set' in Table 5 and training/test data in Table 4, but it does not provide specific details like percentages, sample counts, or explicit citations for the overall train/validation/test splits used for the main experiments on ZINC and Peptides datasets.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The model has been implemented in PyTorch (Paszke et al., 2019) using the PyTorch Geometric (Fey & Lenssen, 2019) and the PyTorch Lightning (Falcon & The PyTorch Lightning team, 2019) library.
Experiment Setup | Yes | The hyperparameters of our model for ZINC (10k and full) and peptides (struct and func) can be found in 12. Note that we adhere to the 500K parameter budget. Each experiment is repeated over three different seeds except for the ZINC-full experiment, where we only have a single run because of computational and time limitations.
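
The experiment setup quoted above names two protocol constraints: a 500K parameter budget and three seeds per experiment (one run for ZINC-full). The following is a minimal sketch of how such checks could be scripted, not the authors' code. The function names, the example weight shapes, and the example scores are hypothetical and chosen only for illustration.

```python
import math
import statistics

# Budget stated in the quoted experiment setup (500K parameters).
PARAM_BUDGET = 500_000


def count_parameters(weight_shapes):
    """Sum the number of entries over a list of weight-tensor shapes.

    `weight_shapes` is a hypothetical stand-in for a model's parameter
    shapes, e.g. [(128, 128), (128,)] for one linear layer with bias.
    """
    return sum(math.prod(shape) for shape in weight_shapes)


def within_budget(weight_shapes, budget=PARAM_BUDGET):
    """Return (total parameter count, whether it fits the budget)."""
    total = count_parameters(weight_shapes)
    return total, total <= budget


def summarize_runs(scores):
    """Return (mean, sample std) over per-seed scores.

    With a single run (as for ZINC-full in the source) the standard
    deviation is undefined, so None is returned in its place.
    """
    mean = statistics.mean(scores)
    std = statistics.stdev(scores) if len(scores) > 1 else None
    return mean, std


if __name__ == "__main__":
    # Hypothetical shapes: one 128x128 linear layer with a bias vector.
    print(within_budget([(128, 128), (128,)]))
    # Hypothetical per-seed errors for a three-seed experiment.
    print(summarize_runs([0.10, 0.12, 0.11]))
    # A single-run experiment reports a mean only.
    print(summarize_runs([0.05]))
```

Returning `None` for the single-run case keeps a one-seed result from being reported with a misleading spread of zero.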