So3krates: Equivariant attention for interactions on arbitrary length-scales in molecular systems

Authors: Thorben Frank, Oliver Unke, Klaus-Robert Müller

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then apply SO3KRATES to the well-established MD17 benchmark and show that our model achieves SOTA results, despite its light-weight structure and having only 0.25-0.4x the number of parameters of competitive architectures (Fig. 1c), while achieving speedups of 6-14x and 2-11x for training and inference, respectively (Fig. 1d).
Researcher Affiliation | Collaboration | 1 Machine Learning Group, TU Berlin, 10587 Berlin, Germany; 2 BIFOLD, Berlin Institute for the Foundations of Learning and Data, Germany; 3 Google Research, Brain team, Berlin; 4 Department of Artificial Intelligence, Korea University, Seoul 136-713, Korea; 5 Max Planck Institut für Informatik, 66123 Saarbrücken, Germany
Pseudocode | No | The paper describes the SO3KRATES architecture and its components using text and mathematical equations, but does not include clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | https://github.com/thorben-frank/mlff
Open Datasets | Yes | Here, we use a subset of the recently introduced QM7-X data set [51], which we call QM7-X250.
Dataset Splits | Yes | It contains 250 different molecular structures, each with 80 data points for training, 10 data points for validation and 11-3748 data points for testing (for details, see appendix A.9). (See the split sketch below.)
Hardware Specification | Yes | * Reference times were taken from [34]. As our own timings were measured on a different GPU, we decreased the reported times according to speedup-factors reported in [37]. For full details, see appendix A.6. (Appendix A.6 mentions Nvidia P100 vs. Nvidia V100.)
Software Dependencies | No | The paper mentions Flax [55], Optax [56], NumPy [57], and JAX [58] in its references, implying their use, but does not provide specific version numbers for these or other dependencies in the main text or appendices. (See the environment check below.)
Experiment Setup | Yes | More details on the implementation, training details and network hyperparameters are given in appendix A.3 and A.13. (Appendix A.13 hyperparameters: number of layers n_l = 4, cutoff radius r_cut = 5.0 Å, maximum degree of SPHCs l_max = 1, number of attention heads n_heads = 4, embedding dimension d = 128, learning rate lr = 1e-3, batch size 256.) (See the configuration sketch below.)
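
The per-structure split described in the Dataset Splits row (80 training and 10 validation points per structure, with the remaining 11-3748 points used for testing) can be expressed as a simple index split. The following is a minimal sketch under the assumption that points are shuffled before splitting; the function name, seed, and shuffling are illustrative and not taken from the mlff repository.

```python
import numpy as np

def split_structure(n_points, n_train=80, n_valid=10, seed=0):
    """Split the conformations of one QM7-X structure into train/valid/test indices.

    Mirrors the split reported in the paper (80 train, 10 validation, rest test);
    the shuffling and seed are illustrative assumptions, not the authors' exact procedure.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_points)
    train = idx[:n_train]
    valid = idx[n_train:n_train + n_valid]
    test = idx[n_train + n_valid:]  # 11-3748 points, depending on the structure
    return train, valid, test

# Example: a structure with 101 conformations yields an 80/10/11 split.
train, valid, test = split_structure(101)
assert len(train) == 80 and len(valid) == 10 and len(test) == 11
```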
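Because the paper names Flax, Optax, NumPy, and JAX but pins no versions, reproducing the environment means recording whatever versions are actually installed. The snippet below is a small bookkeeping helper, not part of the authors' code; it assumes the packages are installed under their PyPI distribution names.

```python
# Report the installed versions of the dependencies cited in the paper's references.
# The paper does not pin versions, so this only prints what the local environment has.
import importlib.metadata as md

for pkg in ("jax", "flax", "optax", "numpy"):
    try:
        print(f"{pkg}=={md.version(pkg)}")
    except md.PackageNotFoundError:
        print(f"{pkg}: not installed")
```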
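The hyperparameters quoted from appendix A.13 can be gathered into a single configuration object for bookkeeping. The dictionary below is a sketch only; the key names are assumptions and do not reflect the actual mlff configuration schema.

```python
# Hyperparameters as reported in appendix A.13 of the paper.
# Key names are illustrative, not the mlff configuration schema.
so3krates_config = {
    "num_layers": 4,           # n_l
    "cutoff_radius": 5.0,      # r_cut, in Angstrom
    "max_sphc_degree": 1,      # l_max, maximum degree of the SPHCs
    "num_attention_heads": 4,  # n_heads
    "embedding_dim": 128,      # d
    "learning_rate": 1e-3,
    "batch_size": 256,
}

if __name__ == "__main__":
    for key, value in so3krates_config.items():
        print(f"{key}: {value}")
```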