Equivariant Matrix Function Neural Networks

Authors: Ilyes Batatia, Lars Leon Schaaf, Gabor Csanyi, Christoph Ortner, Felix Andreas Faber

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The MFN architecture achieves state-of-the-art performance in standard graph benchmarks, such as the ZINC and TU datasets, and is able to capture intricate non-local interactions in quantum systems, paving the way to new state-of-the-art force fields. We compare the non-locality of MFNs to local MPNNs and global attention MPNNs using linear carbon chains, called cumulenes. In this section, we evaluate the performance of our MFN models in graph-level prediction tasks using GCN layers for the matrix construction. (A hypothetical sketch of the matrix-function evaluation follows the table.)
Researcher Affiliation | Collaboration | University of Cambridge, UK; University of British Columbia, Canada. FAF is employed by AstraZeneca at the time of publication; however, none of the work presented in this manuscript was conducted at or influenced by this affiliation.
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; it describes the methods and architecture in text and with diagrams.
Open Source Code | No | The MFN code will be available at: https://github.com/ilyes319/mfn.
Open Datasets | Yes | The ZINC and TU datasets are publicly available. The cumulene dataset is available at: https://github.com/LarsSchaaf/Guaranteed-Non-Local-Molecular-Dataset. (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | The training set contains geometry-optimized cumulenes with 3–10, 13, and 14 carbon atoms, which are then rattled and rotated at various angles. In total, the train, validation, and test sets contain 200, 50, and 170 configurations.
Hardware Specification | Yes | We time the models' energy and force evaluation on an A100 GPU. (A timing sketch follows the table.)
Software Dependencies | No | The paper mentions software such as the ORCA quantum chemistry code and the AdamW optimizer, but it does not provide specific version numbers for these or other key software dependencies required for replication.
Experiment Setup | Yes | Tables 5 and 6 list the model training parameters; for the matrix functions, the number of poles (np) and matrix channels (c) are indicated. Models were trained with AdamW, with default parameters of β1 = 0.9, β2 = 0.999, and ϵ = 10⁻⁸, and a weight decay of 5e-5. We used a learning rate of 0.001 and a batch size of 64. The learning rate was reduced using an on-plateau scheduler. (An optimizer-configuration sketch follows the table.)