Learning Neural PDE Solvers with Parameter-Guided Channel Attention
Authors: Makoto Takamoto, Francesco Alesiani, Mathias Niepert
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare CAPE in conjunction with the curriculum learning strategy using a popular PDE benchmark and obtain consistent and significant improvements over the baseline models. The experiments also show several advantages of CAPE, such as its increased ability to generalize to unseen PDE parameters without large increases in inference time and parameter count. |
| Researcher Affiliation | Collaboration | NEC Laboratories Europe, Heidelberg, Germany; University of Stuttgart, Stuttgart, Germany. Correspondence to: Makoto Takamoto <makoto.takamoto@neclab.eu>. |
| Pseudocode | Yes | Algorithm 1 Algorithm of the curriculum training strategy |
| Open Source Code | Yes | An implementation of the method and experiments is available at https://github.com/nec-research/CAPE-ML4Sci. |
| Open Datasets | Yes | We used datasets provided by PDEBench (Praditia et al., 2022), a benchmark for SciML, from which we selected the following PDEs. |
| Dataset Splits | Yes | For 1-dimensional PDEs, we used N = 9000 training instances and 1000 test instances for each PDE parameter with resolution 128. For 2-dimensional NS equations, we used N = 900 training instances and 100 test instances for each PDE parameter with spatial resolution 64 × 64. |
| Hardware Specification | Yes | The training was performed on a GeForce RTX 2080 GPU for 1D PDEs and a GeForce RTX 3090 for 2D NS equations. The experiments were run using an Nvidia GeForce RTX 3090 with CUDA-11.6. |
| Software Dependencies | Yes | The ML models are implemented using PyTorch 1.12.1 and the numerical simulations with JAX-0.3.17. |
| Experiment Setup | Yes | The optimization was performed with Adam (Kingma & Ba) for 100 epochs. The learning rate was set to 3 × 10⁻³ and divided by 2.0 every 20 epochs. The mini-batch size was 50 in all cases. To stabilize the CAPE module's training in the initial phase, we empirically found it slightly better to include a warm-up phase during which only the CAPE module is updated. We performed warm-up for the first 3 epochs, which slightly reduces the final performance fluctuations resulting from the randomness of the network's initial weights. In the CAPE module, the kernel size of the depth-wise convolution was set to 5. (Hedged sketches of this training configuration and of a depth-wise convolution are given below the table.) |
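
The training configuration quoted above can be sketched in PyTorch, the framework the paper reports using. This is a minimal illustration under assumptions, not the authors' training script: `backbone`, `cape`, and the toy data loader are hypothetical stand-ins, and the way CAPE conditions the backbone is only schematic (the real architectures and the curriculum strategy live in the linked repository).

```python
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR

# Hypothetical stand-ins for illustration only; the real models are the PDEBench
# baselines (e.g. FNO, U-Net) combined with the CAPE module from the authors' repo.
backbone = nn.Conv1d(2, 1, kernel_size=3, padding=1)   # dummy 1D surrogate model
cape = nn.Linear(1, 128)                               # dummy PDE-parameter embedding
loader = [(torch.randn(50, 1, 128), torch.rand(50, 1), torch.randn(50, 1, 128))
          for _ in range(4)]                           # mini-batch size 50

optimizer = Adam(list(backbone.parameters()) + list(cape.parameters()), lr=3e-3)
scheduler = StepLR(optimizer, step_size=20, gamma=0.5)  # divide the LR by 2.0 every 20 epochs
loss_fn = nn.MSELoss()

for epoch in range(100):
    warmup = epoch < 3                      # first 3 epochs: update only the CAPE part
    for p in backbone.parameters():
        p.requires_grad_(not warmup)

    for u, pde_param, target in loader:
        cond = cape(pde_param).unsqueeze(1)             # (B, 1, 128) parameter feature
        pred = backbone(torch.cat([u, cond], dim=1))    # schematic conditioning on PDE parameters
        loss = loss_fn(pred, target)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```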
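
The "depth-wise convolution with kernel size 5" mentioned for the CAPE module corresponds, in PyTorch, to a convolution whose `groups` argument equals its channel count, so each channel is filtered independently. The snippet below shows only that single building block with an assumed channel count; it is not the full parameter-guided channel-attention module.

```python
import torch
from torch import nn

channels = 32   # assumed channel count, for illustration only
# Depth-wise 1D convolution: one kernel-size-5 filter per channel (groups == channels),
# padding chosen so the spatial resolution is preserved.
dw_conv = nn.Conv1d(channels, channels, kernel_size=5, padding=2, groups=channels)

x = torch.randn(8, channels, 128)   # (batch, channels, spatial resolution 128)
y = dw_conv(x)
print(y.shape)                      # torch.Size([8, 32, 128])
```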