SEA: State-Exchange Attention for High-Fidelity Physics Based Transformers

Authors: Parsa Esmati, Amirhossein Dadashzadeh, Vahid Goodarzi Ardakani, Nicolas Larrosa, Nicolò Grilli

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The SEA integrated transformer demonstrates the state-of-the-art rollout error compared to other competitive baselines. Specifically, we outperform PbGMR-GMUS Transformer-RealNVP and GMR-GMUS Transformer, with a reduction in error of 88% and 91%, respectively.
Researcher Affiliation | Collaboration | Parsa Esmati, University of Bristol, parsa.esmati@bristol.ac.uk ... Vahid Goodarzi Ardakani, Sabe Technology Limited, vahid.goodarzi@sabe.tech
Pseudocode | No | The paper describes its methodology using architectural schematics and mathematical equations, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | The repository for this work is available at: https://github.com/ParsaEsmati/SEA
Open Datasets | No | The dataset used in this work consists of two fluid mechanics simulations motivated by physical phenomena: flow around a cylinder and the mixing of immiscible fluids. ... The results of these simulations were labeled and used as the ground truth in our experiments.
Dataset Splits | Yes | To ensure comparability with the literature, we generated 70 trajectories at different Reynolds numbers, with 60% used for training and 20% for validation. ... We generated 40 trajectories for this case, with a similar ratio for training, validation, and testing as in the cylinder flow case. (A trajectory-split sketch follows the table.)
Hardware Specification | Yes | The model was trained on an A100 GPU for approximately 2 hours for both datasets.
Software Dependencies | No | The paper mentions the use of 'OpenFOAM' for generating datasets, but it does not specify version numbers for OpenFOAM or any other key software libraries or frameworks used in the implementation of their models (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | For consistent comparison with state-of-the-art models [Sun et al., 2023, Han et al., 2022], relative mean squared error is used to quantify the errors. The model was trained on an A100 GPU for approximately 2 hours for both datasets. Furthermore, a consistent Transformer architecture was adopted in both cases, utilizing 1 layer and 8 attention heads. The embedding dimension of the model for the cylinder flow case was set to 1024, in line with the literature [Sun et al., 2023], while a dimension of 2048 was used for the multiphase flow case to effectively capture the interface. Full details of the configurations and datasets are provided in Appendix G and F, respectively.
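
The Dataset Splits row reports 70 cylinder-flow trajectories (and 40 multiphase trajectories) divided roughly 60% / 20% / 20% between training, validation, and testing. The following is a minimal sketch of such a trajectory-level split; the random seed, shuffling step, and variable names are illustrative assumptions, not details taken from the paper.

    # Minimal sketch, assuming a simple trajectory-level 60/20/20 split of the
    # 70 cylinder-flow trajectories described in the Dataset Splits row.
    # The seed and shuffling are illustrative assumptions, not paper details.
    import random

    trajectories = list(range(70))           # placeholder trajectory indices
    random.Random(0).shuffle(trajectories)   # fixed seed so the split is repeatable

    n_train = int(0.6 * len(trajectories))   # 42 trajectories for training
    n_val = int(0.2 * len(trajectories))     # 14 trajectories for validation

    train_ids = trajectories[:n_train]
    val_ids = trajectories[n_train:n_train + n_val]
    test_ids = trajectories[n_train + n_val:]   # remaining 14 trajectories for testing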
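
The Experiment Setup row specifies relative mean squared error as the evaluation metric and a single-layer, 8-head Transformer with an embedding dimension of 1024 (cylinder flow) or 2048 (multiphase flow). The sketch below instantiates those reported sizes with a stock PyTorch encoder as a stand-in; it is not the authors' SEA implementation, and the function names are hypothetical.

    # Minimal sketch, assuming the reported configuration: relative mean squared
    # error as the metric and a 1-layer, 8-head Transformer with embedding
    # dimension 1024 (cylinder flow) or 2048 (multiphase flow). A stock PyTorch
    # encoder is used as a stand-in for the authors' SEA-integrated model.
    import torch
    import torch.nn as nn

    def relative_mse(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        """Mean squared error normalized by the mean squared magnitude of the target."""
        return torch.mean((pred - target) ** 2) / torch.mean(target ** 2)

    def make_backbone(embed_dim: int, n_heads: int = 8, n_layers: int = 1) -> nn.Module:
        """Transformer encoder with the layer count, head count, and width reported above."""
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=n_heads, batch_first=True)
        return nn.TransformerEncoder(layer, num_layers=n_layers)

    cylinder_backbone = make_backbone(embed_dim=1024)     # cylinder flow case
    multiphase_backbone = make_backbone(embed_dim=2048)   # multiphase flow case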