So3krates: Equivariant attention for interactions on arbitrary length-scales in molecular systems
Authors: Thorben Frank, Oliver Unke, Klaus-Robert Müller
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then apply SO3KRATES to the well-established MD17 benchmark and show that our model achieves SOTA results, despite its light-weight structure and having only 0.25–0.4x the number of parameters of competitive architectures (Fig. 1c), while achieving speedups of 6–14x and 2–11x for training and inference, respectively (Fig. 1d). |
| Researcher Affiliation | Collaboration | 1 Machine Learning Group, TU Berlin, 10587 Berlin, Germany 2 BIFOLD, Berlin Institute for the Foundations of Learning and Data, Germany 3 Google Research, Brain team, Berlin 4 Department of Artificial Intelligence, Korea University, Seoul 136-713, Korea 5 Max Planck Institut für Informatik, 66123 Saarbrücken, Germany |
| Pseudocode | No | The paper describes the SO3KRATES architecture and its components using text and mathematical equations, but does not include clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | 3https://github.com/thorben-frank/mlff |
| Open Datasets | Yes | Here, we use a subset of the recently introduced QM7-X data set [51], which we call QM7-X250. |
| Dataset Splits | Yes | It contains 250 different molecular structures, each with 80 data points for training, 10 data points for validation, and 11–3748 data points for testing (for details, see appendix A.9). A per-structure split sketch follows the table. |
| Hardware Specification | Yes | * Reference times were taken from [34]. As our own timings were measured on a different GPU, we decreased the reported times according to speedup-factors reported in [37]. For full details, see appendix A.6. (Appendix A.6 mentions: Nvidia P100 vs. Nvidia V100) |
| Software Dependencies | No | The paper mentions software like Flax [55], Optax [56], NumPy [57], and JAX [58] in the references, implying their use, but does not provide specific version numbers for these or other dependencies in the main text or appendices. |
| Experiment Setup | Yes | More details on the implementation, training details and network hyperparameters are given in appendix A.3 and A.13. (Appendix A.13 hyperparameters: number of layers nl=4, cutoff radius rcut=5.0 Å, maximum degree of SPHCs lmax=1, number of attention heads nheads=4, embedding dimension d=128, learning rate lr=1e-3, batch size 256.) A hedged configuration sketch follows the table. |
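
The per-structure split reported for QM7-X250 (80 training, 10 validation, and 11–3748 test points per structure) can be illustrated with a minimal NumPy sketch. The function name `split_structure`, the shuffling step, and the seed are assumptions made for illustration; they are not the procedure described in appendix A.9.

```python
import numpy as np


def split_structure(num_points: int, rng: np.random.Generator):
    """Split one structure's data points into 80 train / 10 val / remaining test indices."""
    perm = rng.permutation(num_points)  # shuffle the data point indices (assumed step)
    return perm[:80], perm[80:90], perm[90:]


rng = np.random.default_rng(0)
# A structure with 101 points yields the minimum reported test size of 11.
train_idx, val_idx, test_idx = split_structure(num_points=101, rng=rng)
assert len(train_idx) == 80 and len(val_idx) == 10 and len(test_idx) == 11
```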
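
The appendix A.13 hyperparameters can be gathered into a small configuration object, shown here as a minimal sketch only: the class name `So3kratesConfig`, its field names, and the choice of `optax.adam` as optimizer are assumptions rather than the actual `mlff` API (the paper cites Optax [56] but the quoted rows do not state the optimizer variant).

```python
from dataclasses import dataclass

import optax  # optimization library cited by the paper [56]


@dataclass(frozen=True)
class So3kratesConfig:  # hypothetical container, not the mlff API
    num_layers: int = 4          # n_l
    cutoff_radius: float = 5.0   # r_cut in Angstrom
    max_sphc_degree: int = 1     # l_max
    num_heads: int = 4           # attention heads
    embed_dim: int = 128         # feature dimension d
    learning_rate: float = 1e-3
    batch_size: int = 256


config = So3kratesConfig()
# Adam is assumed here purely for illustration; the quoted rows only give the learning rate.
optimizer = optax.adam(learning_rate=config.learning_rate)
```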