Challenges in Training PINNs: A Loss Landscape Perspective
Authors: Pratik Rathore, Weimu Lei, Zachary Frangella, Lu Lu, Madeleine Udell
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on optimizing PINNs for convection, wave PDEs, and a reaction ODE. These equations have been studied in previous works investigating difficulties in training PINNs; we use the formulations in Krishnapriyan et al. (2021); Wang et al. (2022b) for our experiments. The coefficient settings we use for these equations are considered challenging in the literature (Krishnapriyan et al., 2021; Wang et al., 2022b). |
| Researcher Affiliation | Academia | 1Department of Electrical Engineering, Stanford University, Stanford, CA, USA 2ICME, Stanford University, Stanford, CA, USA 3Department of Management Science & Engineering, Stanford University, Stanford, CA, USA 4Department of Statistics and Data Science, Yale University, New Haven, CT, USA. Correspondence to: Pratik Rathore <pratikr@stanford.edu>. |
| Pseudocode | Yes | Algorithm 1 Gradient-Damped Newton Descent (GDND), Algorithm 2 Unrolling the L-BFGS Update, Algorithm 3 Performing matrix-vector product, Algorithm 4 NysNewton-CG (NNCG), Algorithm 5 Randomized Nyström Approximation, Algorithm 6 Nyström PCG, Algorithm 7 Armijo line search. (A hedged sketch of the randomized Nyström approximation follows the table.) |
| Open Source Code | Yes | The code for our experiments is available at https://github.com/pratikrathore8/opt_for_pinns. |
| Open Datasets | No | The paper describes how the data points were sampled/generated ('10000 residual points randomly sampled from a 255 × 100 grid', '257 equally spaced points for the initial conditions', '101 equally spaced points for each boundary condition') but does not provide concrete access information (link, DOI, repository name, formal citation with authors/year) for a publicly available or open dataset. |
| Dataset Splits | No | The paper describes the points used for training the PINN loss (residual, initial, and boundary points) and states that L2RE is computed using 'all points in the 255 × 100 grid on the interior of the problem domain, along with the 257 and 101 points used for the initial and boundary conditions'. However, it does not provide specific dataset split information for train/validation/test (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) as might be found in traditional machine learning tasks. |
| Hardware Specification | Yes | Each experiment is run on a single NVIDIA Titan V GPU using CUDA 11.8. |
| Software Dependencies | Yes | We develop our experiments in PyTorch 2.0.0 (Paszke et al., 2019) with Python 3.10.12. |
| Experiment Setup | Yes | For Adam, we tune the learning rate by a grid search on {10⁻⁵, 10⁻⁴, 10⁻³, 10⁻², 10⁻¹}. For L-BFGS, we use the default learning rate 1.0, memory size 100, and strong Wolfe line search. For Adam+L-BFGS, we tune the learning rate for Adam as before, and also vary the switch from Adam to L-BFGS (after 1000, 11000, or 31000 iterations). These correspond to Adam+L-BFGS (1k), Adam+L-BFGS (11k), and Adam+L-BFGS (31k) in our figures. All three methods are run for a total of 41000 iterations. We use multilayer perceptrons (MLPs) with tanh activations and three hidden layers. These MLPs have widths 50, 100, 200, or 400. We initialize these networks with the Xavier normal initialization (Glorot & Bengio, 2010) and all biases equal to zero. (A hedged PyTorch sketch of this setup follows the table.) |
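
The Experiment Setup row translates directly into code. Below is a minimal PyTorch sketch of that configuration: a three-hidden-layer tanh MLP with Xavier normal weights and zero biases, trained with Adam and then switched to L-BFGS (memory size 100, strong Wolfe line search, 41000 total iterations). The function `pinn_loss` is a hypothetical stand-in for the paper's PDE residual + initial-condition + boundary-condition loss, which varies by problem; everything else follows the quoted settings.

```python
import torch
import torch.nn as nn

def make_mlp(width: int, in_dim: int = 2, out_dim: int = 1) -> nn.Sequential:
    """Three hidden layers of the given width (the paper uses 50/100/200/400),
    tanh activations, Xavier normal weights, zero biases."""
    dims = [in_dim, width, width, width, out_dim]
    layers = []
    for i in range(len(dims) - 1):
        linear = nn.Linear(dims[i], dims[i + 1])
        nn.init.xavier_normal_(linear.weight)  # Xavier normal initialization
        nn.init.zeros_(linear.bias)            # all biases equal to zero
        layers.append(linear)
        if i < len(dims) - 2:
            layers.append(nn.Tanh())
    return nn.Sequential(*layers)

model = make_mlp(width=100)

# Hypothetical stand-in loss so the sketch runs end-to-end; the paper's real
# loss sums PDE-residual, initial-condition, and boundary-condition terms.
xt = torch.rand(1000, 2)
def pinn_loss(net: nn.Module) -> torch.Tensor:
    return (net(xt) ** 2).mean()

# Adam lr picked by grid search over {1e-5, 1e-4, 1e-3, 1e-2, 1e-1};
# switch_iter in {1000, 11000, 31000} gives Adam+L-BFGS (1k/11k/31k).
switch_iter, total_iters = 11000, 41000
adam = torch.optim.Adam(model.parameters(), lr=1e-3)
lbfgs = torch.optim.LBFGS(model.parameters(), lr=1.0, history_size=100,
                          max_iter=1, line_search_fn="strong_wolfe")

def closure():
    lbfgs.zero_grad()
    loss = pinn_loss(model)
    loss.backward()
    return loss

for it in range(total_iters):
    if it < switch_iter:
        adam.zero_grad()
        pinn_loss(model).backward()
        adam.step()
    else:
        lbfgs.step(closure)
```

Setting `max_iter=1` makes each `lbfgs.step(closure)` call count as one iteration, so the Adam and L-BFGS phases share a single 41000-iteration budget as described in the row above.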
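Algorithm 5 in the Pseudocode row is a randomized Nyström approximation, the building block behind NysNewton-CG (NNCG) and Nyström PCG. The sketch below implements the standard numerically stable shifted variant of that algorithm as an illustration; it is not taken from the authors' code, and details may differ from their Algorithm 5 (see their repository for the real implementation).

```python
import torch

def randomized_nystrom(matvec, n: int, rank: int, dtype=torch.float64):
    """Randomized Nyström approximation A ≈ U @ diag(lams) @ U.T of a
    symmetric PSD operator accessed only through matvec(V) = A @ V.
    Sketch of the standard shifted variant, not the paper's exact code."""
    Omega = torch.randn(n, rank, dtype=dtype)
    Omega, _ = torch.linalg.qr(Omega)            # orthonormal test matrix
    Y = matvec(Omega)                            # sketch: rank matvecs with A
    nu = torch.finfo(dtype).eps * torch.linalg.matrix_norm(Y, ord="fro")
    Y_nu = Y + nu * Omega                        # small shift for stability
    L = torch.linalg.cholesky(Omega.T @ Y_nu)    # lower Cholesky factor
    B = torch.linalg.solve_triangular(L, Y_nu.T, upper=False).T  # Y_nu L^{-T}
    U, S, _ = torch.linalg.svd(B, full_matrices=False)
    lams = torch.clamp(S**2 - nu, min=0.0)       # undo the shift
    return U, lams                               # A ≈ U @ diag(lams) @ U.T

# Illustrative usage on an explicit PSD matrix:
n = 500
G = torch.randn(n, n, dtype=torch.float64)
A = G @ G.T / n
U, lams = randomized_nystrom(lambda V: A @ V, n, rank=50)
```

Given the factors `U, lams`, applying the approximation (or a preconditioner built from it) costs only O(n · rank) per solve, which is what makes Nyström-based methods like NNCG and Nyström PCG practical at the scale of PINN training.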