Geometric Algebra Transformer

Authors: Johann Brehmer, Pim de Haan, Sönke Behrends, Taco S. Cohen

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate GATr in problems from n-body modeling to wall-shear-stress estimation on large arterial meshes to robotic motion planning. GATr consistently outperforms both non-geometric and equivariant baselines in terms of error, data efficiency, and scalability.
Researcher Affiliation | Industry | Johann Brehmer, Pim de Haan, Sönke Behrends, Taco Cohen, Qualcomm AI Research, {jbrehmer, pim, sbehrend, tacos}@qti.qualcomm.com
Pseudocode | No | The paper describes the architecture and methods in text and figures (like Fig. 1), but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our implementation of GATr is available at https://github.com/qualcomm-ai-research/geometric-algebra-transformer.
Open Datasets | Yes | We use the single-artery wall-shear-stress dataset published by Suk et al. [48]. We train models on the offline trajectory dataset published by Janner et al. [27].
Dataset Splits | Yes | We generate training datasets with n = 4 and between 100 and 10⁵ samples; a validation dataset with n = 4 and 5000 samples; a regular evaluation set with n = 4 and 5000 samples; a number-generalization evaluation set with n = 6 and 5000 samples; and an E(3) generalization set with n = 4, an additional translation (see step 4 above), and 5000 samples. (A split-configuration sketch follows the table.)
Hardware Specification | No | The paper mentions running experiments on 'GPU' for scaling analysis, but does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts.
Software Dependencies | No | The paper mentions software like 'PyBullet [13]' and an 'efficient attention implementation by Lefaudeux et al. [32]' (xFormers), but does not provide specific version numbers for these software components.
Experiment Setup | Yes | All models are trained by minimizing an L2 loss on the final position of all objects. We train for 50 000 steps with the Adam optimizer, using a batch size of 64 and exponentially decaying the learning rate from 3 × 10⁻⁴ to 3 × 10⁻⁶. (A training-loop sketch follows the table.)
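
To make the split layout quoted in the Dataset Splits row easier to scan, here is a minimal configuration sketch. It only transcribes the quoted counts; the SplitConfig class, the split names, and the intermediate training-set sizes between 100 and 10⁵ are illustrative assumptions, not taken from the paper or its code release.

```python
from dataclasses import dataclass


@dataclass
class SplitConfig:
    n_bodies: int                    # number of objects per sample
    num_samples: int                 # number of samples in the split
    extra_translation: bool = False  # apply the additional translation ("step 4") for E(3) generalization


SPLITS = {
    # Training sets range from 100 up to 10**5 samples; the intermediate sizes are illustrative.
    **{f"train_{size}": SplitConfig(n_bodies=4, num_samples=size)
       for size in (100, 1_000, 10_000, 100_000)},
    "val": SplitConfig(n_bodies=4, num_samples=5_000),
    "eval": SplitConfig(n_bodies=4, num_samples=5_000),
    "eval_more_objects": SplitConfig(n_bodies=6, num_samples=5_000),                 # number-generalization set
    "eval_e3": SplitConfig(n_bodies=4, num_samples=5_000, extra_translation=True),   # E(3) generalization set
}
```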
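
The Experiment Setup row pins down the n-body training hyperparameters. Below is a minimal PyTorch-style sketch of that setup, assuming a placeholder MLP and random stand-in data rather than the authors' GATr model and simulator; only the optimizer, batch size, step count, loss, and learning-rate schedule come from the quoted text.

```python
import torch
from torch import nn

# Sketch of the quoted setup: Adam, batch size 64, 50 000 steps, L2 loss on final
# object positions, learning rate decayed exponentially from 3e-4 to 3e-6.
TOTAL_STEPS = 50_000
BATCH_SIZE = 64
LR_START, LR_END = 3e-4, 3e-6
N_OBJECTS = 4  # objects per sample in the standard training split

# Placeholder network and feature layout (position, velocity, mass per object);
# this stands in for the GATr model and is an assumption, not the authors' code.
model = nn.Sequential(nn.Linear(N_OBJECTS * 7, 128), nn.ReLU(), nn.Linear(128, N_OBJECTS * 3))
optimizer = torch.optim.Adam(model.parameters(), lr=LR_START)

# Per-step decay factor so the learning rate reaches LR_END after TOTAL_STEPS steps.
gamma = (LR_END / LR_START) ** (1.0 / TOTAL_STEPS)
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=gamma)

for step in range(TOTAL_STEPS):
    inputs = torch.randn(BATCH_SIZE, N_OBJECTS * 7)            # stand-in for a real batch
    target_positions = torch.randn(BATCH_SIZE, N_OBJECTS * 3)  # stand-in for simulated final positions

    pred_positions = model(inputs)
    loss = torch.mean((pred_positions - target_positions) ** 2)  # L2 loss on final positions

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```

The decay factor gamma = (3e-6 / 3e-4) ** (1 / 50_000) yields an exponential schedule that starts at 3 × 10⁻⁴ and ends at 3 × 10⁻⁶ after 50 000 steps; whether the decay is applied per step or per epoch is not stated in the quote, so the per-step choice here is an assumption.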