Improved Operator Learning by Orthogonal Attention

Authors: Zipeng Xiao, Zhongkai Hao, Bokai Lin, Zhijie Deng, Hang Su

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on six standard neural operator benchmark datasets comprising both regular and irregular geometries show that our method can outperform competing baselines with decent margins. We conduct comprehensive experiments on six challenging operator learning benchmarks and achieve satisfactory results: ONO reduces prediction errors by up to 30% compared to baselines and achieves an 80% reduction of test error for zero-shot super-resolution on Darcy.
Researcher Affiliation | Academia | Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University; Dept. of Comp. Sci. & Tech., Tsinghua University.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement or link indicating open-source code availability for the described methodology.
Open Datasets | Yes | Benchmarks. We first evaluate our model's performance on the Darcy and NS2d (Li et al., 2020) benchmarks to assess its capability on regular grids. Subsequently, we extend our experiments to benchmarks with irregular geometries, including Airfoil, Plasticity, and Pipe, which are represented in structured meshes, as well as Elasticity, presented in point clouds (Li et al., 2022).
Dataset Splits | No | The paper does not provide specific details on how the dataset was split into training, validation, and test sets (e.g., percentages, sample counts, or explicit splitting methodology).
Hardware Specification | Yes | Our experiments are conducted on a single NVIDIA RTX 3090 GPU.
Software Dependencies | No | The paper names the optimizer and scheduler it uses (AdamW, OneCycleLR) but does not provide version numbers for software dependencies such as programming languages or libraries (e.g., Python, PyTorch).
Experiment Setup | Yes | We use the l2 relative error in Equation 1 as the training loss and evaluation metric. We train all models for 500 epochs. Our training process employs the AdamW optimizer (Loshchilov & Hutter, 2018) and the OneCycleLR scheduler (Smith & Topin, 2019). We initialize the learning rate at 1e-3 and explore batch sizes within the range of {2, 4, 8, 16}. The model's width is set to 128, while the orthogonalization process employs a dimension d of either 8 or 16.
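
The following minimal PyTorch sketch illustrates the training configuration quoted above (relative l2 loss, AdamW, OneCycleLR, learning rate 1e-3, 500 epochs, a batch size from {2, 4, 8, 16}, model width 128). The model, tensor shapes, and data below are hypothetical placeholders, since the paper does not release code; the sketch only shows how the stated pieces fit together, not the authors' ONO implementation.

    import torch
    from torch.utils.data import DataLoader, TensorDataset

    def relative_l2_error(pred, target, eps=1e-8):
        # Per-sample relative l2 error ||pred - target|| / ||target||, averaged
        # over the batch; used as both training loss and evaluation metric.
        diff = (pred - target).flatten(1).norm(dim=1)
        norm = target.flatten(1).norm(dim=1)
        return (diff / (norm + eps)).mean()

    # Hypothetical stand-ins for the ONO model and a benchmark dataset.
    width = 128
    model = torch.nn.Sequential(
        torch.nn.Linear(width, width), torch.nn.GELU(), torch.nn.Linear(width, 1))
    inputs = torch.randn(64, 1024, width)   # placeholder (samples, mesh points, features)
    targets = torch.randn(64, 1024, 1)
    loader = DataLoader(TensorDataset(inputs, targets), batch_size=8, shuffle=True)

    epochs = 500
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(
        optimizer, max_lr=1e-3, epochs=epochs, steps_per_epoch=len(loader))

    for epoch in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = relative_l2_error(model(x), y)
            loss.backward()
            optimizer.step()
            scheduler.step()  # OneCycleLR steps once per batch

Batch size 8 is one point in the paper's stated search range; the actual experiments would substitute the ONO architecture and the Darcy/NS2d benchmark tensors for the placeholders.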