Neural Spectral Methods: Self-supervised learning in the spectral domain

Authors: Yiheng Du, Nithin Chalapathi, Aditi S. Krishnapriyan

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 EXPERIMENTAL RESULTS: We compare NSM to different neural operators with different loss functions (PINN and spectral losses) on several PDEs: 2D Poisson (§4.1), 1D Reaction-Diffusion (§4.2), and 2D Navier-Stokes (§4.3) with both forced and unforced flow. NSM is consistently the most accurate method, and orders of magnitude faster during both training and inference, especially on large grid sizes.
Researcher Affiliation | Academia | Yiheng Du, Nithin Chalapathi, Aditi S. Krishnapriyan ({yihengdu, nithinc, aditik1}@berkeley.edu), University of California, Berkeley
Pseudocode | No | The paper describes the method conceptually and visually (Fig. 1), but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our source code is publicly available at https://github.com/ASK-Berkeley/Neural-Spectral-Methods.
Open Datasets | No | During training, the PDE parameters are sampled online from random fields, as described above. We use a batch size of 16.
Dataset Splits | No | During training, the PDE parameters are sampled online from random fields, as described above. We use a batch size of 16. The learning rate is initialized to 10^-3 and exponentially decays to 10^-6 throughout training. Experiments are run with 4 different random seeds, and averaged over the seeds. For each problem, the test set consists of N = 128 PDE parameters, denoted by φ_i. Each φ_i is sampled from the same distribution used at training time, and u_i is the corresponding reference solution. (See the evaluation sketch below the table.)
Hardware Specification | No | We also acknowledge generous support from Google Cloud and AWS Cloud Credit for Research.
Software Dependencies | No | All experiments are implemented using the JAX framework (Bradbury et al., 2018).
Experiment Setup | Yes | Training. During training, the PDE parameters are sampled online from random fields, as described above. We use a batch size of 16. The learning rate is initialized to 10^-3 and exponentially decays to 10^-6 throughout training. Experiments are run with 4 different random seeds, and averaged over the seeds. (...) All models use ReLU activations, except those using a PINN loss, which collapse entirely during training. Therefore, we use tanh activations for FNO+PINN and T1+PINN, and report these results. (...) For each model, we use 4 layers with 64 hidden dimensions. In each layer, we use 31 modes in both dimensions. All models are trained for 30k steps, except for FNO with a grid size of 256, which requires 100k steps to converge. (See the training-configuration sketch below the table.)
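
The training configuration quoted in the Experiment Setup row (batch size 16, learning rate decaying exponentially from 10^-3 to 10^-6, 30k steps) maps naturally onto a JAX training loop, since the paper reports implementing all experiments in JAX. The sketch below is illustrative only: the optax optimizer and schedule, the stand-in linear model, and the placeholder residual loss are assumptions, not the paper's NSM architecture or spectral loss (those are in the linked repository).

```python
# Minimal sketch of the reported training configuration, assuming optax.
import jax
import jax.numpy as jnp
import optax

BATCH_SIZE = 16
TOTAL_STEPS = 30_000           # 100k for FNO at grid size 256, per the quote
INIT_LR, FINAL_LR = 1e-3, 1e-6

# Exponential decay from 1e-3 to 1e-6 over the whole training run.
schedule = optax.exponential_decay(
    init_value=INIT_LR,
    transition_steps=TOTAL_STEPS,
    decay_rate=FINAL_LR / INIT_LR,   # overall decay factor of 1e-3
)
optimizer = optax.adam(learning_rate=schedule)

def model_apply(params, phi):
    # Stand-in forward pass (one linear layer); the paper's NSM model
    # (4 layers, 64 hidden dims, 31 modes per dimension) is not reproduced here.
    return phi @ params["w"] + params["b"]

def loss_fn(params, phi_batch):
    # Placeholder residual objective standing in for the spectral/PINN loss.
    pred = model_apply(params, phi_batch)
    return jnp.mean(pred ** 2)

@jax.jit
def train_step(params, opt_state, phi_batch):
    loss, grads = jax.value_and_grad(loss_fn)(params, phi_batch)
    updates, opt_state = optimizer.update(grads, opt_state, params)
    params = optax.apply_updates(params, updates)
    return params, opt_state, loss

# PDE parameters are sampled online at each step; a Gaussian batch stands in
# for the random-field sampling described in the quote.
key = jax.random.PRNGKey(0)
params = {"w": jnp.zeros((64, 64)), "b": jnp.zeros(64)}
opt_state = optimizer.init(params)
phi_batch = jax.random.normal(key, (BATCH_SIZE, 64))
params, opt_state, loss = train_step(params, opt_state, phi_batch)
```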
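
Similarly, the Dataset Splits row describes evaluation on N = 128 held-out PDE parameters φ_i with reference solutions u_i, averaged over 4 seeds, but the excerpt does not name the error metric. The sketch below assumes the commonly used relative L2 error; the function names are illustrative.

```python
# Sketch of the quoted evaluation protocol: 128 held-out PDE parameters,
# compared against reference solutions and averaged over seeds.
# The relative L2 metric is an assumption; the excerpt does not name it.
import jax.numpy as jnp

def relative_l2(pred, ref):
    # ||pred - ref||_2 / ||ref||_2 for one test instance.
    return jnp.linalg.norm(pred - ref) / jnp.linalg.norm(ref)

def evaluate(predict_fn, test_params, test_refs):
    # test_params: the 128 sampled parameters phi_i; test_refs: reference u_i.
    errors = [relative_l2(predict_fn(phi), u)
              for phi, u in zip(test_params, test_refs)]
    return jnp.mean(jnp.array(errors))

# Reported numbers would then be the mean of `evaluate` over 4 random seeds.
```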