Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets
Authors: Dominique Beaini, Shenyang Huang, Joao Alex Cunha, Zhiyi Li, Gabriela Moisescu-Pareja, Oleksandr Dymov, Samuel Maddrell-Mander, Callum McLean, Frederik Wenkel, Luis Müller, Jama Hussein Mohamud, Ali Parviz, Michael Craig, Michał Koziarski, Jiarui Lu, Zhaocheng Zhu, Cristian Gabellini, Kerstin Klaser, Josef Dean, Cas Wognum, Maciej Sypetkowski, Guillaume Rabusseau, Reihaneh Rabbany, Jian Tang, Christopher Morris, Mirco Ravanelli, Guy Wolf, Prudencio Tossou, Hadrien Mary, Therence Bois, Andrew W Fitzgibbon, Blazej Banaszewski, Chad Martin, Dominic Masters
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we present a range of baseline results as a starting point of multi-task and multi-level training on these datasets. (Abstract) and 4 EXPERIMENTS ON BASELINE MODELS (Section 4 title). |
| Researcher Affiliation | Collaboration | ¹Mila - Québec AI Institute, ²Valence Labs, ³Université de Montréal, ⁴McGill University, ⁵Graphcore, ⁶New Jersey Institute of Technology, ⁷RWTH Aachen University, ⁸HEC Montréal, ⁹CIFAR AI Chair |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The Graphium library code can be accessed on Github: https://github.com/datamol-io/graphium. This code allows fellow researchers to reproduce and build upon our experimental results. |
| Open Datasets | Yes | Download links for the ToyMix, LargeMix and UltraLarge datasets are found on Zenodo: Part 1 (https://zenodo.org/records/7998401) and Part 2 (https://zenodo.org/records/8370547). (A download sketch follows the table.) |
| Dataset Splits | Yes | All datasets in ToyMix are split randomly into train/validation/test with a ratio of 0.8/0.1/0.1. For PCQM4M_G25_N4, a 0.92/0.04/0.04 train/validation/test/test_seen split is created; the remaining datasets are split randomly with the same 0.92/0.04/0.04 ratio. (A split sketch follows the table.) |
| Hardware Specification | Yes | Baseline results were trained on a Bow Pod-16 IPU system, comprising 16 Bow IPU accelerators, with some additional results obtained on A100 and V100 GPUs. |
| Software Dependencies | No | The paper mentions various graph learning libraries such as PyTorch Geometric, DGL, Jraph, tf_geometric, StellarGraph, Graph Nets, CogDL, GraphGym, TorchDrug, DeepChem, and GraphGPS, as well as the Poplar software stack, but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | For all baseline models, the GNN module uses 4 GNN layers with a hidden size chosen to approximately balance the number of trainable parameters across models (reported in Tab. 2). ... For these cases we fix both the learning rate and total number of epochs (300) used. (Section 4.1), and Table 8 (Hyperparameters for the MPNN++ model in the scaling-law experiment) lists specific values such as GNN depth 16, dropout 0.01, batch size 8192, learning rate 0.008 with a 5-epoch warm-up, Adam optimizer, and 100 epochs. (A hedged config sketch follows the table.) |
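
The Zenodo records listed under Open Datasets can be fetched programmatically. The sketch below is a minimal illustration, assuming the public Zenodo REST API (`/api/records/{id}` returning a `files` list with `key` and `links.self` fields); it is not part of the Graphium library, and archive names are read from the record metadata rather than hard-coded.

```python
# Minimal sketch: download every file attached to the two Zenodo records
# (Part 1: 7998401, Part 2: 8370547) via the public Zenodo REST API.
# The JSON layout ("files" -> "key" / "links" -> "self") is an assumption
# based on the documented Zenodo API; verify against the live records.
import requests

RECORD_IDS = ["7998401", "8370547"]

for record_id in RECORD_IDS:
    meta = requests.get(f"https://zenodo.org/api/records/{record_id}", timeout=60)
    meta.raise_for_status()
    for entry in meta.json().get("files", []):
        name, url = entry["key"], entry["links"]["self"]
        print(f"Downloading {name} from record {record_id} ...")
        with requests.get(url, stream=True, timeout=60) as resp, open(name, "wb") as fh:
            resp.raise_for_status()
            for chunk in resp.iter_content(chunk_size=1 << 20):
                fh.write(chunk)
```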
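
The split ratios quoted under Dataset Splits amount to a simple random partition. The following is a minimal sketch of such a split, assuming nothing about Graphium's own data module; the function name `random_split` and the fixed seed are illustrative only.

```python
# Illustrative random split with the ToyMix ratio 0.8/0.1/0.1
# (pass ratios=(0.92, 0.04, 0.04) for the LargeMix-style split).
import numpy as np

def random_split(n_samples: int, ratios=(0.8, 0.1, 0.1), seed: int = 42):
    """Return (train, valid, test) index arrays for a random split."""
    assert abs(sum(ratios) - 1.0) < 1e-8, "ratios must sum to 1"
    perm = np.random.default_rng(seed).permutation(n_samples)
    n_train = int(ratios[0] * n_samples)
    n_valid = int(ratios[1] * n_samples)
    return perm[:n_train], perm[n_train:n_train + n_valid], perm[n_train + n_valid:]

train_idx, valid_idx, test_idx = random_split(100_000)
print(len(train_idx), len(valid_idx), len(test_idx))  # 80000 10000 10000
```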
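
The Table 8 values quoted under Experiment Setup translate directly into an optimizer configuration. Below is a hedged sketch assuming plain PyTorch: the dictionary keys, the stand-in model, and the use of `LinearLR` for the 5-epoch warm-up are assumptions, since the paper only states the values themselves (Graphium's actual config schema may differ).

```python
# Scaling-law hyperparameters quoted from Table 8, gathered in a plain dict.
# Key names are illustrative, not Graphium configuration keys.
import torch

mpnn_scaling_config = {
    "gnn_depth": 16,        # number of MPNN++ layers
    "dropout": 0.01,
    "batch_size": 8192,
    "optimizer": "Adam",
    "learning_rate": 0.008,
    "warmup_epochs": 5,     # "5 epoch warm up" in the paper
    "epochs": 100,
}

model = torch.nn.Linear(16, 1)  # stand-in for the actual MPNN++ model
opt = torch.optim.Adam(model.parameters(), lr=mpnn_scaling_config["learning_rate"])
# Linear warm-up over the first 5 epochs; the exact warm-up schedule is an
# assumption -- the paper does not specify its shape.
warmup = torch.optim.lr_scheduler.LinearLR(
    opt, start_factor=0.1, end_factor=1.0,
    total_iters=mpnn_scaling_config["warmup_epochs"],
)

for epoch in range(mpnn_scaling_config["epochs"]):
    # ... training loop over batches of size 8192 would go here ...
    warmup.step()  # advance the warm-up schedule once per epoch
```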