A Neural-Preconditioned Poisson Solver for Mixed Dirichlet and Neumann Boundary Conditions
Authors: Kai Weixian Lan, Elias Gueidon, Ayano Kaneda, Julian Panetta, Joseph Teran
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our solver outperforms state-of-the-art methods like algebraic multigrid as well as recently proposed neural preconditioners on challenging test cases arising from incompressible fluid simulations. We demonstrate through a comprehensive benchmark on challenging fluid-simulation test cases that, when paired with an appropriate iterative method, our neural-preconditioned solver dramatically outperforms state-of-the-art solvers like algebraic multigrid and incomplete Cholesky, as well as recent neural preconditioners like DCDM and FluidNet. |
| Researcher Affiliation | Academia | (1) University of California, Davis, USA; (2) University of California, Los Angeles, USA; (3) Waseda University, Tokyo, Japan. |
| Pseudocode | Yes | Algorithm 1: Neural-preconditioned Steepest Descent with A-Orthogonalization (NPSDO). Algorithm 2: Operations executed by an L-level network in pseudocode form. (A hedged sketch of the NPSDO iteration follows the table.) |
| Open Source Code | Yes | To promote reproducibility, we have released our full code and a link to our pretrained model at https://github.com/kai-lan/MLPCG/tree/icml2024. |
| Open Datasets | No | Our training data set consists of 107 matrices collected from 11 different simulation scenes, some of domain shape (128, 128, 128) and others (256, 128, 128). For each matrix, we generate 800 right-hand side vectors using a similar approach to (Kaneda et al., 2023) but with far fewer Rayleigh-Ritz vectors. We first compute 1600 Ritz vectors using Lanczos iterations (Lanczos, 1950) and then generate from them 800 random linear combinations. These linear combinations are finally normalized and added to the training set. No concrete access information (link, DOI, specific citation with author/year) is provided for this generated dataset. (A sketch of this right-hand-side generation follows the table.) |
| Dataset Splits | No | Our training data set consists of 107 matrices collected from 11 different simulation scenes... For each matrix, we process all of its 800 right-hand sides in batches of 128, repeating five times. The full training process takes around 5 days... We utilize the transfer learning technique (Pan & Yang, 2010), training first a 5-level network and using those weights to initialize a 6-level network, which is subsequently fine-tuned and used for all experiments. No explicit mention of validation splits (e.g., percentages, sample counts, or a separate validation dataset) is provided. (A sketch of the transfer-learning initialization follows the table.) |
| Hardware Specification | Yes | We executed all benchmarks on a workstation featuring an AMD Ryzen 9 5950X 16-Core Processor and an NVIDIA GeForce RTX 3080 GPU. |
| Software Dependencies | No | We built our network using PyTorch (Paszke et al., 2019), but implemented our convolutional and linear blocks as custom CUDA extensions. We compare against baselines using libraries such as CuPy (Okuta et al., 2017), AMGCL (Demidov, 2020), NVIDIA's AmgX (Naumov et al., 2015), NVIDIA's cuSPARSE library, and CHOLMOD (Chen et al., 2008). However, specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | We train our network P_net to approximate A_I⁻¹ b when presented with image I and input vector b. We calculate the loss for an example (I, A_I, b) from our training dataset as the residual norm: Loss = ‖b − A_I P_net(I, b)‖₂. In each epoch of training, we loop over all matrices in our dataset in shuffled order. For each matrix, we process all of its 800 right-hand sides in batches of 128, repeating five times. We empirically determined n_ortho = 2 to perform well (see Appendix A.4) and use this value in all reported experiments. The neural network was trained using single-precision floating point. We used as our convergence criterion for all methods a reduction of the residual norm by a factor of 10⁶, which is sufficiently accurate to eliminate visible simulation artifacts. The network architecture is defined recursively, consisting of levels 1 ≤ ℓ ≤ L. (A hedged sketch of the training loop follows the table.) |
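
The Pseudocode row above cites Algorithm 1 (NPSDO) without reproducing it. The sketch below is our reading of a preconditioned steepest-descent iteration with A-orthogonalization against the previous n_ortho search directions, not the authors' code: `apply_preconditioner` is a placeholder for the trained network P_net(I, ·), and the Jacobi preconditioner in the usage example is only a stand-in so the snippet runs.

```python
import numpy as np
import scipy.sparse as sp

def npsd_ortho(A, b, apply_preconditioner, x0=None, n_ortho=2,
               tol=1e-6, max_iter=100):
    """Hedged sketch of NPSDO-style iteration: each search direction is the
    (neural-)preconditioned residual, A-orthogonalized against the previous
    `n_ortho` directions, followed by an exact line search."""
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x
    b_norm = np.linalg.norm(b)
    prev_dirs, prev_Adirs = [], []           # last n_ortho directions and A @ d

    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            break
        # Preconditioned residual; the network is fed a normalized input.
        d = apply_preconditioner(r / np.linalg.norm(r))
        # A-orthogonalize against the previous n_ortho search directions.
        for d_j, Ad_j in zip(prev_dirs, prev_Adirs):
            d = d - (d @ Ad_j) / (d_j @ Ad_j) * d_j
        Ad = A @ d
        alpha = (r @ d) / (d @ Ad)           # exact line search along d
        x += alpha * d
        r -= alpha * Ad
        prev_dirs = (prev_dirs + [d])[-n_ortho:]
        prev_Adirs = (prev_Adirs + [Ad])[-n_ortho:]
    return x, r

# Usage example with a Jacobi (diagonal) preconditioner standing in for P_net.
A = sp.diags([2.0, -1.0, -1.0], [0, -1, 1], shape=(64, 64), format='csr')
b = np.ones(64)
inv_diag = 1.0 / A.diagonal()
x, r = npsd_ortho(A, b, lambda v: inv_diag * v)
```

With n_ortho = 0 this reduces to plain preconditioned steepest descent; the paper reports n_ortho = 2 working well.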
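
The right-hand-side generation quoted in the Open Datasets row (Ritz vectors from Lanczos iterations, then normalized random linear combinations) can be sketched as follows. SciPy's `eigsh` (ARPACK, a Lanczos-type solver) stands in for the authors' Lanczos routine, and the Gaussian coefficient distribution is our assumption.

```python
import numpy as np
import scipy.sparse.linalg as spla

def make_rhs_training_vectors(A, n_ritz=1600, n_rhs=800, seed=0):
    """Sketch of the quoted recipe: compute Ritz vectors of A with a
    Lanczos-type eigensolver, form random linear combinations, normalize."""
    rng = np.random.default_rng(seed)
    # Ritz vectors approximating the low end of A's spectrum
    # (eigsh requires n_ritz < dim(A)).
    _, ritz = spla.eigsh(A, k=n_ritz, which='SM')
    # Random linear combinations of the Ritz vectors, normalized to unit length.
    coeffs = rng.standard_normal((n_ritz, n_rhs))
    rhs = ritz @ coeffs
    rhs /= np.linalg.norm(rhs, axis=0, keepdims=True)
    return rhs.T                              # shape (n_rhs, n), one vector per row
```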
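
The Dataset Splits row mentions transfer learning from a 5-level to a 6-level network. A minimal PyTorch illustration of that initialization pattern is below; the `MultilevelPreconditioner` class is a deliberately simplified stand-in, not the paper's architecture.

```python
import torch.nn as nn

class MultilevelPreconditioner(nn.Module):
    """Simplified stand-in for the paper's L-level network: one conv block per
    level (the real architecture's down/up-sampling and linear blocks are omitted)."""
    def __init__(self, num_levels):
        super().__init__()
        self.levels = nn.ModuleList(
            nn.Conv3d(1, 1, kernel_size=3, padding=1) for _ in range(num_levels)
        )

    def forward(self, x):
        for conv in self.levels:
            x = conv(x).relu()
        return x

# Transfer learning as described: train a 5-level network first, then use its
# weights to initialize the matching levels of a 6-level network.
net5 = MultilevelPreconditioner(num_levels=5)
# ... net5 would be trained here ...

net6 = MultilevelPreconditioner(num_levels=6)
# strict=False copies every parameter whose name matches; the new 6th level
# keeps its fresh initialization and is learned during fine-tuning.
net6.load_state_dict(net5.state_dict(), strict=False)
```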
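
The Experiment Setup row describes the training loop: shuffled matrices, 800 right-hand sides per matrix in batches of 128 repeated five times, and a residual-norm loss ‖b − A_I P_net(I, b)‖₂. A hedged PyTorch sketch of that loop is below; the data layout and function names are our assumptions, not the released code.

```python
import torch

def residual_loss(A, b, x_pred):
    """Loss = ‖b − A x_pred‖₂ averaged over the batch.
    A is assumed to be a sparse (n, n) torch tensor; b, x_pred are (batch, n)."""
    residual = b - torch.sparse.mm(A, x_pred.T).T
    return residual.norm(dim=1).mean()

def train_epoch(net, dataset, optimizer, batch_size=128, repeats=5):
    """One epoch in the style quoted above.  `dataset` is assumed to yield
    tuples (image, A_sparse, rhs_matrix), with rhs_matrix of shape (800, n),
    already in shuffled matrix order."""
    for image, A, rhs in dataset:
        for _ in range(repeats):
            perm = torch.randperm(rhs.shape[0])
            for start in range(0, rhs.shape[0], batch_size):
                b = rhs[perm[start:start + batch_size]]
                optimizer.zero_grad()
                x_pred = net(image, b)        # P_net(I, b)
                loss = residual_loss(A, b, x_pred)
                loss.backward()
                optimizer.step()
```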