Characterizing possible failure modes in physics-informed neural networks
Authors: Aditi Krishnapriyan, Amir Gholami, Shandian Zhe, Robert Kirby, Michael W. Mahoney
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Recent work in scientific machine learning has developed so-called physics-informed neural network (PINN) models. The typical approach is to incorporate physical domain knowledge as soft constraints on an empirical loss function and use existing machine learning methodologies to train the model. We demonstrate that, while existing PINN methodologies can learn good models for relatively trivial problems, they can easily fail to learn relevant physical phenomena for even slightly more complex problems. In particular, we analyze several distinct situations of widespread physical interest, including learning differential equations with convection, reaction, and diffusion operators. We provide evidence that the soft regularization in PINNs, which involves PDE-based differential operators, can introduce a number of subtle problems, including making the problem more ill-conditioned. Importantly, we show that these possible failure modes are not due to the lack of expressivity in the NN architecture, but that the PINN's setup makes the loss landscape very hard to optimize. We then describe two promising solutions to address these failure modes. The first approach is to use curriculum regularization, where the PINN's loss term starts from a simple PDE regularization, and becomes progressively more complex as the NN gets trained. The second approach is to pose the problem as a sequence-to-sequence learning task, rather than learning to predict the entire space-time at once. Extensive testing shows that we can achieve up to 1-2 orders of magnitude lower error with these methods as compared to regular PINN training. |
| Researcher Affiliation | Collaboration | Aditi S. Krishnapriyan1,2, Amir Gholami2, Shandian Zhe3, Robert M. Kirby3, Michael W. Mahoney2,4 1Lawrence Berkeley National Laboratory, 2University of California, Berkeley, 3University of Utah, 4International Computer Science Institute |
| Pseudocode | No | The paper describes the methods and formulations using prose and mathematical equations but does not include any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | We have open sourced our framework [26] which is built on top of PyTorch both to help with reproducibility and also to enable other researchers to extend the results. |
| Open Datasets | No | The paper describes generating 'collocation points (x, t) on the domain' from the PDE system for training, rather than using a pre-existing, citable, publicly available dataset. |
| Dataset Splits | No | The paper mentions 'randomly sample collocation points (x, t) on the domain' for training and evaluating against 'analytical solution' but does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, or cloud resources) used to run the experiments. |
| Software Dependencies | No | The paper mentions that their framework 'is built on top of PyTorch', but it does not specify the version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Experiment setup. We study both linear and non-linear PDEs/ODEs... We use a 4-layer fully-connected NN with 50 neurons per layer, a hyperbolic tangent activation function, and randomly sample collocation points (x, t) on the domain... We train this network using the L-BFGS optimizer and sweep over learning rates from 1e-4 to 2.0. |
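
The abstract quoted above describes the core PINN construction: the PDE is imposed as a soft penalty on an empirical loss. The sketch below illustrates that construction in PyTorch for the 1D convection equation u_t + beta * u_x = 0, one of the operators the paper studies. It is a minimal illustration, not the authors' released framework; the value of beta, the initial condition, and the number of sampled points are assumptions for demonstration only.

```python
# Minimal PINN soft-constraint loss for the 1D convection equation
#   u_t + beta * u_x = 0,   x in [0, 2*pi], t in [0, 1]
# Illustrative sketch only; beta, the initial condition, and point counts
# are assumed values, not taken from the paper.
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
beta = 30.0  # convection coefficient (assumed for illustration)

# 4-layer fully-connected network with tanh activations, 50 neurons per layer,
# matching the architecture described in the experiment setup above.
net = nn.Sequential(
    nn.Linear(2, 50), nn.Tanh(),
    nn.Linear(50, 50), nn.Tanh(),
    nn.Linear(50, 50), nn.Tanh(),
    nn.Linear(50, 50), nn.Tanh(),
    nn.Linear(50, 1),
)

def pde_residual(xt):
    """PDE residual u_t + beta * u_x at collocation points xt = (x, t)."""
    xt = xt.clone().requires_grad_(True)
    u = net(xt)
    grads = torch.autograd.grad(u, xt, torch.ones_like(u), create_graph=True)[0]
    u_x, u_t = grads[:, 0:1], grads[:, 1:2]
    return u_t + beta * u_x

def pinn_loss(xt_coll, xt_data, u_data):
    """Data/initial-condition MSE plus the soft PDE-residual penalty."""
    mse_data = ((net(xt_data) - u_data) ** 2).mean()
    mse_pde = (pde_residual(xt_coll) ** 2).mean()
    return mse_data + mse_pde

# Randomly sampled collocation points and initial-condition data (assumed sizes)
x = torch.rand(1000, 1) * 2 * math.pi
t = torch.rand(1000, 1)
xt_coll = torch.cat([x, t], dim=1)
x0 = torch.rand(100, 1) * 2 * math.pi
xt0 = torch.cat([x0, torch.zeros_like(x0)], dim=1)
u0 = torch.sin(x0)  # assumed initial condition u(x, 0) = sin(x)

print(pinn_loss(xt_coll, xt0, u0).item())
```

The soft-constraint term is what the paper identifies as the source of the hard-to-optimize loss landscape; curriculum regularization would gradually increase beta (or the operator's difficulty) during training, while the sequence-to-sequence variant would fit one time segment at a time instead of the full space-time domain.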
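The experiment setup quoted above trains the network with full-batch L-BFGS and sweeps learning rates from 1e-4 to 2.0. The sketch below shows one way such a sweep could be run with `torch.optim.LBFGS`, reusing `net`, `pinn_loss`, and the sampled points from the previous sketch. The sweep grid, iteration budget, and restart-from-the-same-initialization choice are assumptions, not the paper's exact protocol.

```python
# Illustrative L-BFGS training with a learning-rate sweep (assumed grid and budget).
import copy
import torch

best_lr, best_loss = None, float("inf")
init_state = copy.deepcopy(net.state_dict())

for lr in [1e-4, 1e-3, 1e-2, 1e-1, 1.0, 2.0]:        # assumed sweep grid
    net.load_state_dict(copy.deepcopy(init_state))    # restart from the same init
    opt = torch.optim.LBFGS(net.parameters(), lr=lr, max_iter=500,
                            line_search_fn="strong_wolfe")

    def closure():
        opt.zero_grad()
        loss = pinn_loss(xt_coll, xt0, u0)
        loss.backward()
        return loss

    opt.step(closure)                                  # runs up to max_iter inner iterations
    final_loss = pinn_loss(xt_coll, xt0, u0).item()
    if final_loss < best_loss:
        best_lr, best_loss = lr, final_loss

print(f"best lr = {best_lr}, final loss = {best_loss:.3e}")
```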