Spatial Attention Kinetic Networks with E(n)-Equivariance

Authors: Yuanqing Wang, John Chodera

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the robustness and versatility of SAKE, we benchmark its performance on potential energy approximation and dynamical system forecasting and sampling tasks. For all popular benchmarks, compared to state-of-the-art models, SAKE achieves competitive performance on a wide range of invariant (MD17: Table 1, QM9: Table 3, ISO17: Table 2) and equivariant (N-body charged particle: Table 4, walking motion: Table 6) tasks while requiring only a fraction of their training and inference time. 6 EXPERIMENTS As discussed in Section 2, SAKE provides a mapping from and to the joint space of geometry and embedding X × H, being equivariant on the geometric space and invariant on the embedding space. We are therefore interested in characterizing the performance of SAKE on two types of tasks: invariant modeling (Section 6.1), where we model some scalar property of a physical system, namely potential energy; and equivariant modeling (Section 6.2), where we predict coördinates conditioned on initial position, velocity, and embedding. On both classes of tasks, SAKE displays competitive performance while requiring significantly less inference time than current state-of-the-art models. See Appendix Section 9 for experimental details and settings.
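The distinction between the two task classes quoted above reduces to the symmetry property the model output must satisfy. The following is a minimal, illustrative sketch of that property check; the helper names (random_rotation, check_invariant, check_equivariant) and the NumPy-based test are assumptions for illustration, not part of the SAKE codebase.

    import numpy as np

    def random_rotation(dim=3, seed=0):
        # Draw a random proper rotation matrix via QR decomposition.
        rng = np.random.default_rng(seed)
        q, r = np.linalg.qr(rng.normal(size=(dim, dim)))
        q = q * np.sign(np.diag(r))        # fix the sign ambiguity of QR
        if np.linalg.det(q) < 0:           # enforce det(R) = +1
            q[:, 0] *= -1
        return q

    def check_invariant(f, x, h, atol=1e-5):
        # Scalar outputs (e.g. potential energy) must be unchanged under a
        # rotation R and translation t applied to the input coordinates x.
        R = random_rotation(x.shape[-1])
        t = np.random.default_rng(1).normal(size=x.shape[-1])
        return np.allclose(f(x, h), f(x @ R.T + t, h), atol=atol)

    def check_equivariant(f, x, h, atol=1e-5):
        # Coordinate outputs (e.g. forecast positions) must transform with
        # the same rotation and translation applied to the input.
        R = random_rotation(x.shape[-1])
        t = np.random.default_rng(1).normal(size=x.shape[-1])
        return np.allclose(f(x, h) @ R.T + t, f(x @ R.T + t, h), atol=atol)

Under this reading, an energy model on MD17/QM9/ISO17 would be expected to pass check_invariant, while the N-body and walking-motion forecasting models would be expected to pass check_equivariant.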
Researcher Affiliation | Academia | Yuanqing Wang and John D. Chodera, Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, N.Y. 10065. yuanqing.wang@choderalab.org. Alternative address: Ph.D. Program in Physiology, Biophysics, and Systems Biology, Weill Cornell Medical College, Cornell University, New York, N.Y. 10065.
Pseudocode | Yes | Algorithm 1: Spatial Attention Kinetic Networks Layer
Open Source Code | Yes | Implementation: https://github.com/choderalab/sake. The software package containing the algorithm proposed here, along with all scripts used to conduct the experiments in this paper, is distributed open source under the MIT license at https://github.com/choderalab/sake and can be installed via: pip install sake-gnn. All necessary code, data, and details to reproduce the experiments can be found in Appendix Section 9.
Open Datasets | Yes | For all popular benchmarks, compared to state-of-the-art models, SAKE achieves competitive performance on a wide range of invariant (MD17: Table 1, QM9: Table 3, ISO17: Table 2) and equivariant (N-body charged particle: Table 4, walking motion: Table 6) tasks. Table 9: Dataset details. MD17 (Table 1): http://quantum-machine.org/gdml/#datasets; 8 systems; 100K-1M snapshots; random split, 1K train. ISO17 (Table 2): http://quantum-machine.org/datasets/; 129 molecules; 5000 snapshots; fixed split. QM9 (Table 3): http://quantum-machine.org/datasets/; 135k molecules; fixed split. N-body charged particle forecast (Table 4): MIT; 5 particles; fixed split, 3K train / 2K valid / 2K test.
Dataset Splits | Yes | Table 9: Dataset details. N-body charged particle forecast (Table 4): MIT; 5 particles; fixed split, 3K train / 2K valid / 2K test.
Hardware Specification | Yes | All models are trained on NVIDIA Tesla V100 GPUs. Following the settings reported in the publications of baseline models, the inference-time benchmark experiments (Sections 6.1 and 6.2) are run on an NVIDIA GeForce GTX 1080 Ti GPU (for Tables 1 and 6) and an NVIDIA GeForce GTX 2080 Ti GPU (for Table 4).
Software Dependencies | No | The paper mentions software components like "SiLU", "CELU", and the Adam optimizer, but does not provide specific version numbers for these or for broader frameworks like PyTorch or Python. For example, it states "SiLU is used everywhere as activation" but not, e.g., "PyTorch 1.x".
Experiment Setup | Yes | One-layer feed-forward neural networks are used as fr in Equation 2 (edge update); two-layer feed-forward neural networks are used as ϕe in Equation 2 (edge update), ϕv in Equation 4 (node update), ϕv^V in Equation 12 (velocity update), and µ in Equation 6 (spatial attention). SiLU is used everywhere as the activation, except in Equation 12 (velocity update), where the last activation function is chosen as y = 2 Sigmoid(x) to constrain the velocity scaling to between 0 and 2, and in Equation 10, where CELU is used before attention; additionally, tanh is applied to the additive part of Equation 12 to constrain it to between -1 and 1. 4 attention heads are used, with γ in Equation 10 spaced evenly between 0 and 5 Å. 50 RBF bases are used, spaced evenly between 0 and 5 Å. All models are optimized with the Adam optimizer. We summarize the hyperparameters used in these experiments in Table 8.
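As a minimal sketch of how the activation and basis choices quoted above might look in code: the excerpt only specifies 50 radial bases spaced evenly between 0 and 5 Å and the sigmoid/tanh constraints of Equation 12, so the Gaussian basis shape and the width parameter gamma below are illustrative assumptions, not values from the paper.

    import numpy as np

    N_RBF, R_MIN, R_MAX = 50, 0.0, 5.0    # 50 bases, spaced evenly on [0, 5] Å

    def rbf_expand(distances, gamma=10.0):
        # Expand pairwise distances onto radial basis functions; the Gaussian
        # form and the width gamma are assumptions for illustration only.
        centers = np.linspace(R_MIN, R_MAX, N_RBF)
        return np.exp(-gamma * (np.asarray(distances)[..., None] - centers) ** 2)

    def velocity_scale(x):
        # Final activation of the velocity update (Equation 12):
        # y = 2 * sigmoid(x), constraining the velocity scaling to (0, 2).
        return 2.0 / (1.0 + np.exp(-np.asarray(x)))

    def additive_part(x):
        # tanh applied to the additive part of Equation 12, constraining it
        # to (-1, 1).
        return np.tanh(np.asarray(x))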