Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Improved Operator Learning by Orthogonal Attention
Authors: Zipeng Xiao, Zhongkai Hao, Bokai Lin, Zhijie Deng, Hang Su
ICML 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on six standard neural operator benchmark datasets comprising both regular and irregular geometries show that our method can outperform competing baselines with decent margins. We conduct comprehensive experiments on six challenging operator learning benchmarks and achieve satisfactory results: ONO reduces prediction errors by up to 30% compared to baselines and achieves 80% reduction of test error for zero-shot super-resolution on Darcy. |
| Researcher Affiliation | Academia | 1Qing Yuan Research Institute, SEIEE, Shanghai Jiao Tong University 2Dept. of Comp. Sci. & Tech., Tsinghua University. |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability for the described methodology. |
| Open Datasets | Yes | Benchmarks. We first evaluate our model's performance on Darcy and NS2d (Li et al., 2020) benchmarks to evaluate its capability on regular grids. Subsequently, we extend our experiments to benchmarks with irregular geometries, including Airfoil, Plasticity, and Pipe, which are represented in structured meshes, as well as Elasticity, presented in point clouds (Li et al., 2022). |
| Dataset Splits | No | The paper does not provide specific details on how the dataset was split into training, validation, and test sets (e.g., percentages, sample counts, or explicit splitting methodology). |
| Hardware Specification | Yes | Our experiments are conducted on a single NVIDIA RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions specific optimizers and schedulers used (AdamW optimizer, OneCycleLR scheduler) but does not provide specific version numbers for software dependencies such as programming languages or libraries (e.g., Python, PyTorch). |
| Experiment Setup | Yes | We use the L2 relative error in Equation 1 as the training loss and evaluation metric. We train all models for 500 epochs. Our training process employs the AdamW optimizer (Loshchilov & Hutter, 2018) and the OneCycleLR scheduler (Smith & Topin, 2019). We initialize the learning rate at 10⁻³ and explore batch sizes within the range of {2, 4, 8, 16}. The model's width is set to 128, while the orthogonalization process employs dimension d as either 8 or 16. |
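The relative L2 error quoted above is the standard metric in this benchmark suite (mean over samples of the L2 norm of the residual divided by the L2 norm of the target). A minimal sketch is below; the function name and the per-sample flattening convention are assumptions for illustration, not the authors' code.

```python
import numpy as np

def relative_l2_error(pred, target):
    """Batch-averaged relative L2 error:
    mean over samples of ||pred - target||_2 / ||target||_2,
    with each sample flattened to a vector first."""
    pred = np.asarray(pred, dtype=float).reshape(len(pred), -1)
    target = np.asarray(target, dtype=float).reshape(len(target), -1)
    num = np.linalg.norm(pred - target, axis=1)  # per-sample residual norm
    den = np.linalg.norm(target, axis=1)         # per-sample target norm
    return float(np.mean(num / den))
```

For example, a perfect prediction gives 0.0, and predicting all zeros for a target of norm 5 gives exactly 1.0, since the residual norm equals the target norm.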