Characterizing possible failure modes in physics-informed neural networks

Authors: Aditi Krishnapriyan, Amir Gholami, Shandian Zhe, Robert Kirby, Michael W. Mahoney

NeurIPS 2021

Reproducibility Variable Result LLM Response
Research Type Experimental Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models. The typical approach is to incorporate physical domain knowledge as soft constraints on an empirical loss function and use existing machine learning methodologies to train the model. We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena for even slightly more complex problems. In particular, we analyze several distinct situations of widespread physical interest, including learning differential equations with convection, reaction, and diffusion operators. We provide evidence that the soft regularization in PINNs, which involves PDE-based differential operators, can introduce a number of subtle problems, including making the problem more ill-conditioned. Importantly, we show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize. We then describe two promising solutions to address these failure modes. The first approach is to use curriculum regularization, where the PINN's loss term starts from a simple PDE regularization, and becomes progressively more complex as the NN gets trained. The second approach is to pose the problem as a sequence-to-sequence learning task, rather than learning to predict the entire space-time at once. Extensive testing shows that we can achieve up to 1-2 orders of magnitude lower error with these methods as compared to regular PINN training.
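The curriculum-regularization idea in the abstract can be sketched as a simple training loop: start from an easy PDE coefficient and warm-start each harder stage from the previous one. This is a minimal sketch of the concept only; `train_pinn`, the `betas` schedule values, and `warm_start` are illustrative placeholders, not the authors' actual code.

```python
# Hedged sketch of curriculum regularization for PINNs: train on an easy
# PDE coefficient first, then progressively harder ones, warm-starting
# each stage from the parameters of the previous stage.
def curriculum_train(train_pinn, betas=(1.0, 5.0, 10.0, 20.0, 30.0)):
    params = None
    for beta in betas:  # easy -> hard PDE coefficients (values are illustrative)
        params = train_pinn(beta, warm_start=params)
    return params

# Toy usage: a stand-in trainer that just records the schedule it was given.
history = []
def toy_trainer(beta, warm_start=None):
    history.append(beta)
    return {"beta": beta}

final = curriculum_train(toy_trainer)
```

The key design point is the warm start: each stage begins from a solution of a better-conditioned problem, rather than optimizing the hardest loss landscape from a random initialization.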
Researcher Affiliation Collaboration Aditi S. Krishnapriyan1,2, Amir Gholami2, Shandian Zhe3, Robert M. Kirby3, Michael W. Mahoney2,4 — 1Lawrence Berkeley National Laboratory, 2University of California, Berkeley, 3University of Utah, 4International Computer Science Institute
Pseudocode No The paper describes the methods and formulations using prose and mathematical equations but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code Yes We have open sourced our framework [26] which is built on top of PyTorch both to help with reproducibility and also to enable other researchers to extend the results.
Open Datasets No The paper describes generating 'collocation points (x, t) on the domain' from the PDE system for training, rather than using a pre-existing, citable, publicly available dataset.
Dataset Splits No The paper mentions 'randomly sample collocation points (x, t) on the domain' for training and evaluating against 'analytical solution' but does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts).
Hardware Specification No The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud resources) used to run the experiments.
Software Dependencies No The paper mentions that their framework 'is built on top of PyTorch', but it does not specify the version number for PyTorch or any other software dependencies.
Experiment Setup Yes Experiment setup. We study both linear and non-linear PDEs/ODEs... We use a 4-layer fully-connected NN with 50 neurons per layer, a hyperbolic tangent activation function, and randomly sample collocation points (x, t) on the domain... We train this network using the L-BFGS optimizer and sweep over learning rates from 1e-4 to 2.0.
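The setup quoted above (4 fully-connected layers of 50 tanh units, random collocation points, L-BFGS) can be sketched in PyTorch, the framework the paper's code builds on. This is an assumed reconstruction, not the authors' implementation: the layer-count interpretation (4 hidden layers), the convection coefficient `beta`, the learning rate, and the residual-only loss are illustrative choices.

```python
# Hedged sketch of the experiment setup: a 4-layer, 50-neuron tanh MLP
# trained with L-BFGS on randomly sampled collocation points (x, t).
import torch
import torch.nn as nn

class PINN(nn.Module):
    def __init__(self, in_dim=2, hidden=50, out_dim=1, n_layers=4):
        super().__init__()
        layers = [nn.Linear(in_dim, hidden), nn.Tanh()]
        for _ in range(n_layers - 1):
            layers += [nn.Linear(hidden, hidden), nn.Tanh()]
        layers.append(nn.Linear(hidden, out_dim))
        self.net = nn.Sequential(*layers)

    def forward(self, x, t):
        return self.net(torch.cat([x, t], dim=-1))

model = PINN()
# Randomly sampled collocation points (x, t) on the domain, as in the paper.
x = torch.rand(1000, 1, requires_grad=True)
t = torch.rand(1000, 1, requires_grad=True)

# The paper sweeps learning rates from 1e-4 to 2.0; 1e-2 here is arbitrary.
opt = torch.optim.LBFGS(model.parameters(), lr=1e-2)

def closure():
    opt.zero_grad()
    u = model(x, t)
    # PDE derivatives via autograd.
    u_t = torch.autograd.grad(u, t, torch.ones_like(u), create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    beta = 1.0                    # convection coefficient (illustrative)
    residual = u_t + beta * u_x   # residual of the convection PDE u_t + beta * u_x = 0
    loss = residual.pow(2).mean()
    loss.backward()
    return loss

opt.step(closure)
```

A full PINN loss would also include boundary- and initial-condition terms; only the soft PDE-residual penalty discussed in the review is shown here.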