Spatial Attention Kinetic Networks with E(n)-Equivariance
Authors: Yuanqing Wang, John Chodera
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To demonstrate the robustness and versatility of SAKE, we benchmark its performance on potential energy approximation and dynamical system forecasting and sampling tasks. For all popular benchmarks, compared to state-of-the-art models, SAKE achieves competitive performance on a wide range of invariant (MD17: Table 1, QM9: Table 3, ISO17: Table 2) and equivariant (N-body charged particle: Table 4, walking motion: Table 6) tasks while requiring only a fraction of their training and inference time. [Section 6, Experiments] As discussed in Section 2, SAKE provides a mapping from and to the joint space of geometry and embedding X × H, while being equivariant on the geometric space and invariant on the embedding space. We are therefore interested in characterizing the performance of SAKE on two types of tasks: invariant modeling (Section 6.1), where we model some scalar property of a physical system, namely potential energy; and equivariant modeling (Section 6.2), where we predict coördinates conditioned on initial position, velocity, and embedding. On both classes of tasks, SAKE displays competitive performance while requiring significantly less inference time compared to current state-of-the-art models. See Appendix Section 9 for experimental details and settings. (A schematic check of this equivariance/invariance property is sketched after this table.) |
| Researcher Affiliation | Academia | Yuanqing Wang and John D. Chodera, Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, N.Y. 10065; yuanqing.wang@choderalab.org. Alternative address: Ph.D. Program in Physiology, Biophysics, and Systems Biology, Weill Cornell Medical College, Cornell University, New York, N.Y. 10065 |
| Pseudocode | Yes | Algorithm 1 Spatial Attention Kinetic Networks Layer |
| Open Source Code | Yes | Implementation: https://github.com/choderalab/sake. The software package containing the algorithm proposed here is distributed open source under the MIT license. All necessary code, data, and details to reproduce the experiments can be found in Appendix Section 9. The corresponding software package and all scripts used to conduct the experiments in this paper are distributed open source under the MIT license at https://github.com/choderalab/sake and can be installed via: pip install sake-gnn. |
| Open Datasets | Yes | For all popular benchmarks, compared to state-of-the-art models, SAKE achieves competitive performance on a wide range of invariant (MD17: Table 1, QM9: Table 3, ISO17: Table 2) and equivariant (N-body charged particle: Table 4, walking motion: Table 6) tasks. Table 9 (dataset details): MD17 (Table 1), http://quantum-machine.org/gdml/#datasets, 8 systems, 100K-1M snapshots, random split with 1K training samples; ISO17 (Table 2), http://quantum-machine.org/datasets/, 129 molecules, 5,000 snapshots, fixed split; QM9 (Table 3), http://quantum-machine.org/datasets/, 135K molecules, fixed split; N-body charged-particle forecast (Table 4), MIT license, 5 particles, fixed split: 3K train, 2K validation, 2K test. |
| Dataset Splits | Yes | Table 9 (dataset details): N-body charged-particle forecast (Table 4), 5 particles; fixed split: 3K train, 2K validation, 2K test. |
| Hardware Specification | Yes | All models are trained on NVIDIA Tesla V100 GPUs. Following the settings reported in the publications of baseline models, the inference time benchmark experiments (Sections 6.1, 6.2) are run on an NVIDIA GeForce GTX 1080 Ti GPU (for Table 1 and Table 6) and an NVIDIA GeForce GTX 2080 Ti GPU (for Table 4). |
| Software Dependencies | No | The paper mentions software components like "SiLU", "CELU", and the Adam optimizer but does not provide specific version numbers for these or for broader frameworks like PyTorch or Python. For example, it states "SiLU is used everywhere as activation", but not "PyTorch 1.x". |
| Experiment Setup | Yes | One-layer feed-forward neural networks are used as fr in Equation 2 (edge update); two-layer feed-forward neural networks are used as ϕe in Equation 2 (edge update), ϕv in Equation 4 (node update), ϕvV in Equation 12 (velocity update), and µ in Equation 6 (spatial attention). SiLU is used everywhere as the activation, except in Equation 12 (velocity update), where the last activation function is chosen as y = 2 Sigmoid(x) to constrain the velocity scaling to between 0 and 2, and in Equation 10, where CELU is used before attention; additionally, tanh is applied to the additive part of Equation 12 to constrain it to between -1 and 1. 4 attention heads are used, with γ in Equation 10 spaced evenly between 0 and 5 Å. 50 RBF basis functions are used, spaced evenly between 0 and 5 Å. All models are optimized with the Adam optimizer. The hyperparameters used in these experiments are summarized in Table 8. (A schematic sketch of the RBF expansion and velocity-scaling activation follows this table.) |
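The equivariance/invariance property quoted under "Research Type" (equivariant on the geometric space, invariant on the embedding space) can be made concrete with a small numerical check. The sketch below is not the SAKE implementation: `toy_layer` is a hypothetical stand-in whose coordinate update depends only on relative displacements and whose embedding update depends only on pairwise distances, which is exactly the structural property being tested.

```python
# Minimal sketch (NOT the SAKE model) of the E(n)-equivariance/invariance
# property described in the paper: coordinates transform with the input frame,
# embeddings do not. `toy_layer` is a hypothetical illustrative layer.
import numpy as np

def toy_layer(x, h):
    """Toy joint update on (coordinates x, embeddings h)."""
    diff = x[None, :, :] - x[:, None, :]                      # (n, n, 3) displacements x_j - x_i
    dist = np.linalg.norm(diff, axis=-1)                      # (n, n) pairwise distances (invariant)
    w = np.exp(-dist) * h.sum(-1, keepdims=True)              # invariant edge weights
    x_new = x + (w[..., None] * diff).sum(axis=1) / len(x)    # equivariant coordinate update
    h_new = np.tanh(h + dist.sum(axis=1, keepdims=True))      # invariant embedding update
    return x_new, h_new

rng = np.random.default_rng(0)
x, h = rng.normal(size=(5, 3)), rng.normal(size=(5, 8))
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))                  # random orthogonal transform
t = rng.normal(size=3)                                        # random translation

x1, h1 = toy_layer(x, h)
x2, h2 = toy_layer(x @ Q.T + t, h)
assert np.allclose(x2, x1 @ Q.T + t)                          # coordinates: E(n)-equivariant
assert np.allclose(h2, h1)                                    # embeddings: E(n)-invariant
```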
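Two of the concrete choices quoted under "Experiment Setup" are straightforward to sketch: 50 radial basis functions spaced evenly between 0 and 5 Å, and the velocity-scaling activation y = 2 Sigmoid(x) that constrains the scaling to (0, 2). The snippet below assumes a Gaussian basis shape with a width derived from the center spacing; the quoted text does not specify either, so treat both as illustrative placeholders rather than the paper's exact parameterization.

```python
# Sketch of the distance featurization and velocity-scaling activation
# described in the "Experiment Setup" row. The Gaussian basis shape and the
# width `gamma_rbf` are assumptions; only the count (50) and the range
# (0-5 Å) come from the quoted text.
import numpy as np

N_RBF, R_MAX = 50, 5.0
centers = np.linspace(0.0, R_MAX, N_RBF)           # 50 centers spaced evenly on [0, 5] Å
gamma_rbf = 1.0 / (centers[1] - centers[0]) ** 2   # assumed width from center spacing

def rbf_expand(d):
    """Expand distances d of shape (...,) into smooth features of shape (..., 50)."""
    return np.exp(-gamma_rbf * (d[..., None] - centers) ** 2)

def velocity_scale(s):
    """y = 2 * Sigmoid(s): keeps the per-node velocity scaling factor in (0, 2)."""
    return 2.0 / (1.0 + np.exp(-s))

d = np.array([0.0, 1.2, 3.7, 5.0])                 # example pairwise distances in Å
print(rbf_expand(d).shape)                          # (4, 50)
print(velocity_scale(np.array([-3.0, 0.0, 3.0])))   # ~[0.09, 1.00, 1.90]
```

Per the quoted setup, the bounded 2·Sigmoid keeps the velocity update a strictly positive, bounded rescaling of the previous velocity, while the tanh on the additive part of Equation 12 bounds the learned correction to between -1 and 1.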