Improved Operator Learning by Orthogonal Attention
Authors: Zipeng Xiao, Zhongkai Hao, Bokai Lin, Zhijie Deng, Hang Su
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on six standard neural operator benchmark datasets, comprising both regular and irregular geometries, show that our method outperforms competing baselines by decent margins. We conduct comprehensive experiments on six challenging operator learning benchmarks and achieve satisfactory results: ONO reduces prediction errors by up to 30% compared to baselines and achieves an 80% reduction of test error for zero-shot super-resolution on Darcy. |
| Researcher Affiliation | Academia | ¹Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University; ²Dept. of Comp. Sci. & Tech., Tsinghua University. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability for the described methodology. |
| Open Datasets | Yes | Benchmarks. We first evaluate our model's performance on the Darcy and NS2d (Li et al., 2020) benchmarks to evaluate its capability on regular grids. Subsequently, we extend our experiments to benchmarks with irregular geometries, including Airfoil, Plasticity, and Pipe, which are represented in structured meshes, as well as Elasticity, presented in point clouds (Li et al., 2022). |
| Dataset Splits | No | The paper does not provide specific details on how the dataset was split into training, validation, and test sets (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | Yes | Our experiments are conducted on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions specific optimizers and schedulers used (AdamW optimizer, OneCycleLR scheduler) but does not provide specific version numbers for software dependencies such as programming languages or libraries (e.g., Python, PyTorch). |
| Experiment Setup | Yes | We use the ℓ2 relative error in Equation 1 as the training loss and evaluation metric. We train all models for 500 epochs. Our training process employs the AdamW optimizer (Loshchilov & Hutter, 2018) and the OneCycleLR scheduler (Smith & Topin, 2019). We initialize the learning rate at 10⁻³ and explore batch sizes within the range of {2, 4, 8, 16}. The model's width is set to 128, while the orthogonalization process employs dimension d as either 8 or 16. |
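
As a reading aid, the following is a minimal PyTorch sketch of the training configuration quoted in the Experiment Setup row. The paper does not release code, so the `model` and `train_loader` objects below are hypothetical placeholders rather than the ONO architecture or its benchmark pipelines; only the relative ℓ2 loss, AdamW optimizer, OneCycleLR scheduler, initial learning rate of 10⁻³, 500 epochs, width 128, and batch-size range {2, 4, 8, 16} are taken from the reported setup.

```python
import torch

# Relative L2 error (the paper's Equation 1), used as both training loss and
# evaluation metric: ||pred - target||_2 / ||target||_2, averaged over the batch.
def rel_l2_loss(pred: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    pred = pred.reshape(pred.shape[0], -1)
    target = target.reshape(target.shape[0], -1)
    err = torch.linalg.norm(pred - target, dim=1) / torch.linalg.norm(target, dim=1)
    return err.mean()

device = "cuda" if torch.cuda.is_available() else "cpu"

# Hypothetical stand-ins: `model` and `train_loader` are placeholders, not the
# actual ONO model or benchmark data loaders (no official code is provided).
model = torch.nn.Linear(128, 128).to(device)   # width 128 per the reported setup
train_loader: list = []                        # placeholder for a Darcy/NS2d DataLoader

epochs = 500                                   # "We train all models for 500 epochs."
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)   # initial lr 10^-3
scheduler = torch.optim.lr_scheduler.OneCycleLR(
    optimizer,
    max_lr=1e-3,
    total_steps=max(1, epochs * max(1, len(train_loader))),
)

for epoch in range(epochs):
    for x, y in train_loader:                  # batch size chosen from {2, 4, 8, 16}
        x, y = x.to(device), y.to(device)
        optimizer.zero_grad()
        loss = rel_l2_loss(model(x), y)
        loss.backward()
        optimizer.step()
        scheduler.step()                       # OneCycleLR is stepped per batch
```

Note that the single NVIDIA RTX 3090 GPU mentioned under Hardware Specification is reflected only through the device selection above; no multi-GPU or mixed-precision settings are reported in the paper.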