Choose a Transformer: Fourier or Galerkin

Authors: Shuhao Cao

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we present three operator learning experiments, including the viscid Burgers' equation, an interface Darcy flow, and an inverse interface coefficient identification problem. The newly proposed simple attention-based operator learner, Galerkin Transformer, shows significant improvements in both training cost and evaluation accuracy over its softmax-normalized counterparts." (A sketch of this softmax-free attention follows the table.)
Researcher Affiliation | Academia | Shuhao Cao, Department of Mathematics and Statistics, Washington University in St. Louis, s.cao@wustl.edu
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | "The PyTorch codes to reproduce our results are available as an open-source software." https://github.com/scaomath/galerkin-transformer
Open Datasets | Yes | "The data are obtained courtesy of the PDE benchmark under the MIT license." https://github.com/zongyi-li/fourier_neural_operator
Dataset Splits | No | "The data is split 80%/20% for training/evaluation for all three examples." While a train/test split is mentioned, a separate validation split is not explicitly specified. (A split sketch follows the table.)
Hardware Specification | Yes | The training and evaluation are done on a single GPU with 32 GB of memory. Specifically, the benchmarks reported in Table 1 use an NVIDIA A100 GPU.
Software Dependencies | No | The paper mentions several software libraries, such as PyTorch, NumPy, and SciPy, in the acknowledgments, but does not provide specific version numbers for them as dependencies.
Experiment Setup | Yes | "All attention-based models match the parameter quota of the baseline, and are trained using the loss in (2) with the same 1cycle scheduler [78] for 100 epochs." (A training-loop sketch follows the table.)
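
The "simple attention-based operator learner" quoted in the Research Type row is built on softmax-free attention. In the Galerkin-type variant, layer normalization is applied to the keys and values, and the key-value product is contracted over the sequence dimension before being multiplied by the queries, making the cost linear in the number of grid points. The single-head sketch below is illustrative only (the head layout, projections, and 1/n scaling are simplifying assumptions); the released repository at https://github.com/scaomath/galerkin-transformer contains the actual implementation.

```python
import torch
import torch.nn as nn


class GalerkinAttention(nn.Module):
    """Single-head, softmax-free attention of Galerkin type (illustrative sketch).

    Instead of softmax(Q K^T / sqrt(d)) V, the keys and values are layer-normalized
    and contracted first: Q (K~^T V~) / n, which is linear in the sequence length n.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.norm_k = nn.LayerNorm(d_model)
        self.norm_v = nn.LayerNorm(d_model)
        self.out_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n, d_model), where n is the number of grid points
        n = x.size(1)
        q = self.q_proj(x)
        k = self.norm_k(self.k_proj(x))  # layer norm on keys
        v = self.norm_v(self.v_proj(x))  # layer norm on values
        # Contract over the sequence dimension first: a (d x d) matrix per sample
        kv = torch.einsum("bnd,bne->bde", k, v) / n
        z = torch.einsum("bnd,bde->bne", q, kv)
        return self.out_proj(z)
```

Applied to a batch of shape `(4, 2048, 96)`, `GalerkinAttention(96)` returns a tensor of the same shape, and the contraction scales linearly in the 2048 grid points rather than quadratically.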
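For the Dataset Splits row, here is a minimal sketch of an 80%/20% train/evaluation split using `torch.utils.data.random_split`. The sample count, grid size, batch size, and seed below are placeholders rather than values from the paper, whose released code ships its own data-loading utilities.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Placeholder tensors standing in for the Burgers/Darcy benchmark data.
inputs = torch.randn(1280, 512, 1)   # hypothetical: 1280 samples on a 512-point grid
targets = torch.randn(1280, 512, 1)
dataset = TensorDataset(inputs, targets)

n_train = int(0.8 * len(dataset))    # 80% for training
n_eval = len(dataset) - n_train      # 20% for evaluation
train_set, eval_set = random_split(
    dataset, [n_train, n_eval],
    generator=torch.Generator().manual_seed(0),  # fixed seed for reproducibility
)

train_loader = DataLoader(train_set, batch_size=8, shuffle=True)
eval_loader = DataLoader(eval_set, batch_size=8, shuffle=False)
```

As the row above observes, this scheme has no separate validation split; the 20% holdout is used directly for evaluation.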
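For the Experiment Setup row, here is a hedged sketch of a 100-epoch training loop with PyTorch's `OneCycleLR` (the 1cycle policy of [78]) and a plain relative L2 loss standing in as a simplified surrogate for the paper's loss in (2). It reuses `GalerkinAttention` and `train_loader` from the sketches above; the optimizer choice, maximum learning rate, and lifting/projection layers are illustrative assumptions, not settings reported in the paper.

```python
import torch
import torch.nn as nn


def relative_l2_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    # Batch-averaged relative L2 error; a simplified surrogate for the loss in (2).
    diff = torch.linalg.vector_norm(pred - target, dim=(1, 2))
    ref = torch.linalg.vector_norm(target, dim=(1, 2))
    return (diff / ref).mean()


# Hypothetical stand-in for the full Galerkin Transformer: a lifting layer,
# one softmax-free attention block, and a projection back to the target channel.
model = nn.Sequential(nn.Linear(1, 96), GalerkinAttention(96), nn.Linear(96, 1))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
epochs = 100
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1e-3,                      # illustrative value, not taken from the paper
    epochs=epochs,
    steps_per_epoch=len(train_loader),
)

for epoch in range(epochs):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = relative_l2_loss(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()              # the 1cycle schedule advances once per batch
```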