Equivariant Matrix Function Neural Networks
Authors: Ilyes Batatia, Lars Leon Schaaf, Gabor Csanyi, Christoph Ortner, Felix Andreas Faber
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The MFN architecture achieves state-of-the-art performance in standard graph benchmarks, such as the ZINC and TU datasets, and is able to capture intricate non-local interactions in quantum systems, paving the way to new state-of-the-art force fields. We compare the non-locality of MFNs to local MPNNs and global attention MPNNs using linear carbon chains, called cumulenes. In this section, we evaluate the performance of our MFN models in graph-level prediction tasks using GCN layers for the matrix construction. |
| Researcher Affiliation | Collaboration | University of Cambridge, UK; University of British Columbia, Canada. FAF is employed by AstraZeneca at the time of publication; however, none of the work presented in this manuscript was conducted at or influenced by this affiliation. |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. It describes the methods and architecture in text and with diagrams. |
| Open Source Code | No | The MFN code will be available at: https://github.com/ilyes319/mfn. |
| Open Datasets | Yes | The ZINC and TU datasets are publicly available. The cumulene dataset is available at: https://github.com/LarsSchaaf/Guaranteed-Non-Local-Molecular-Dataset. |
| Dataset Splits | Yes | The training set contains geometry-optimized cumulenes with 3-10 and 13, 14 carbon atoms, which are then rattled and rotated at various angles. In total, the train, validation, and test sets contain 200, 50, and 170 configurations. |
| Hardware Specification | Yes | We time the models' energy and force evaluation on an A100 GPU. |
| Software Dependencies | No | The paper mentions software like the ORCA quantum chemistry code and the AdamW optimizer, but it does not provide specific version numbers for these or other key software dependencies required for replication. |
| Experiment Setup | Yes | Table 5: Model training parameters. For the matrix functions the number of poles (np) and matrix channels (c) are indicated. Table 6: Model training parameters. For the matrix functions the number of poles (np) and matrix channels (c) are indicated. Models were trained with AdamW, with default parameters of β1 = 0.9, β2 = 0.999, and ϵ = 10⁻⁸, and a weight decay of 5e-5. We used a learning rate of 0.001 and a batch size of 64. The learning rate was reduced using an on-plateau scheduler. |
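The optimizer and scheduler settings quoted above can be reproduced directly in PyTorch. The sketch below is a minimal illustration, assuming a PyTorch training loop; the `torch.nn.Linear` module stands in for an MFN model, which is not part of the paper's reported configuration.

```python
# Hedged sketch of the reported training configuration: AdamW with
# β1 = 0.9, β2 = 0.999, ε = 1e-8, weight decay 5e-5, learning rate
# 0.001, and an on-plateau learning-rate scheduler. The model is a
# placeholder; factor and patience below are assumed values, since
# the paper does not state them.
import torch

model = torch.nn.Linear(4, 1)  # placeholder for an MFN model

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-3,
    betas=(0.9, 0.999),
    eps=1e-8,
    weight_decay=5e-5,
)

scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=10
)

# Per-epoch usage: step the scheduler with the validation loss so the
# learning rate is reduced when validation loss plateaus.
val_loss = 1.0
scheduler.step(val_loss)
```

A batch size of 64 would be set on the `DataLoader` rather than on the optimizer.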