Accelerating Legacy Numerical Solvers by Non-intrusive Gradient-based Meta-solving
Authors: Sohei Arisaka, Qianxiao Li
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We theoretically and numerically show the advantage of the proposed method over other baselines and present applications of accelerating established non-automatic-differentiable numerical solvers implemented in PETSc, a widely used open-source numerical software library. ... In Section 4, we theoretically and numerically show that the training using the proposed method converges faster than the training using the original forward gradient in a simple problem setting. ... In Section 5, we demonstrate an application of the proposed method to accelerate established non-automatic-differentiable numerical solvers implemented in PETSc, resulting in a significant speedup. |
| Researcher Affiliation | Collaboration | ¹Department of Mathematics, National University of Singapore; ²Kajima Corporation, Japan. Correspondence to: Sohei Arisaka <sohei.arisaka@u.nus.edu>. |
| Pseudocode | Yes | Algorithm 1 Non-intrusive gradient-based meta-solving |
| Open Source Code | Yes | The source code of the experiments is available at https://github.com/arisakaso/nigbms. |
| Open Datasets | No | The paper describes the creation of task distributions P and Q for the toy example (1D Poisson equation) and mentions sampling tasks from them. It also describes sampling source terms for the biharmonic equation and density for linear elasticity. However, it does not provide concrete access information (e.g., specific link, DOI, repository name, formal citation with authors/year) for these custom-generated datasets. It only describes the process of generating the data. |
| Dataset Splits | Yes | We sampled 15,000 tasks from each distribution of P and Q, and evenly and randomly split them into training, validation, and test sets. (A hypothetical generation-and-split sketch follows the table.) |
| Hardware Specification | Yes | Note that the experiments are conducted using the following hardware: Intel Xeon W-3335 CPU @ 3.40GHz and NVIDIA GeForce RTX 3090. |
| Software Dependencies | No | The paper mentions "PyTorch" for implementing differentiable methods and "PETSc" and "FEniCS" for numerical solvers. However, it does not provide specific version numbers for any of these software dependencies, which are crucial for reproducibility. (A version-logging sketch follows the table.) |
| Experiment Setup | Yes | For the surrogate model f̂, we use a convolutional neural network because the relation of the θ coordinates is local. To update the surrogate model f̂, the Adam optimizer (Kingma & Ba, 2015) with learning rate 0.01 is used as Ôpt. Note that the surrogate model f̂ is trained online during the minimization of f, without pre-training. The main optimizer Opt is also the Adam optimizer, with α = 0.1 for the Sphere function and α = 0.01 for the Rosenbrock function. The number of optimization steps is 250 for the Sphere function and 50,000 for the Rosenbrock function. For comparison, we also test using the true gradient ∇f, the gradient of the surrogate model ∇f̂ (Jacovi et al., 2019), and the forward gradient g_{v,ϵ} (Belouze, 2022) instead of our control variate forward gradient h_{v,ϵ}. We set the finite difference step size ϵ = 10⁻⁸. ... For the Jacobi method, we use learning rate 10⁻⁵ for the meta-solver and 5.0 × 10⁻⁴ for the surrogate model. For the multigrid method, we use learning rate 10⁻⁶ for the meta-solver and 5.0 × 10⁻⁴ for the surrogate model. For the supervised baseline, we use learning rate 10⁻⁴. The batch size is set to 256 for all training, and the number of epochs is set to 100. The best model is selected based on the validation loss. The finite difference step size ϵ is set to 10⁻¹² for all training. (An illustrative estimator sketch follows the table.) |
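
The two dataset rows above describe only the generation process (task distributions P and Q over the 1D Poisson equation) and the split sizes (15,000 tasks per distribution, split evenly and randomly into training, validation, and test sets). The sketch below is a minimal, hypothetical reconstruction of such a pipeline: the Gaussian source-term sampling, the grid size, and the seed are illustrative assumptions, not the authors' data-generation code.

```python
import numpy as np

# Hypothetical sketch of the toy-example data pipeline: 1D Poisson tasks
# A u = b on a uniform grid with randomly sampled source terms, followed by
# an even, random train/validation/test split of 15,000 tasks.
# The Gaussian sampling of b and the grid size are assumptions for illustration.

def make_poisson_matrix(n: int) -> np.ndarray:
    """Second-order finite-difference 1D Laplacian with Dirichlet boundary conditions."""
    A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
    return A * (n + 1) ** 2  # scale by 1/h^2

def sample_task(n: int, rng: np.random.Generator) -> dict:
    """One task = (A, b); the source term b is sampled randomly (assumed Gaussian)."""
    return {"A": make_poisson_matrix(n), "b": rng.standard_normal(n)}

rng = np.random.default_rng(0)
tasks = [sample_task(n=31, rng=rng) for _ in range(15_000)]

# Even, random split into training, validation, and test sets (5,000 tasks each).
perm = rng.permutation(len(tasks))
train_idx, val_idx, test_idx = np.split(perm, 3)
train = [tasks[i] for i in train_idx]
val = [tasks[i] for i in val_idx]
test = [tasks[i] for i in test_idx]
print(len(train), len(val), len(test))  # 5000 5000 5000
```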
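
The software-dependencies row notes that PyTorch, PETSc, and FEniCS are named without version numbers. A version-logging step at experiment start is a cheap way to close that gap; the snippet below is only a sketch, and the package names torch, petsc4py, and dolfin (the legacy FEniCS interface) are assumptions about how the stack is imported.

```python
# Sketch: log the versions of the stack named in the paper (PyTorch, PETSc, FEniCS).
# Assumes the Python bindings petsc4py and dolfin (legacy FEniCS) are in use;
# adjust the imports to match the actual environment.
import torch
import petsc4py
import dolfin

print("PyTorch :", torch.__version__)
print("petsc4py:", petsc4py.__version__)
print("FEniCS  :", dolfin.__version__)
```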
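
The experiment-setup row refers to a control variate forward gradient h_{v,ϵ} built from a finite-difference forward gradient g_{v,ϵ} and an online-trained surrogate f̂ whose exact gradient is cheap to compute. The sketch below shows one plausible form of that estimator on the Sphere function, using the quoted hyperparameters (Adam with α = 0.1 for the main optimizer, Adam with learning rate 0.01 for the surrogate, ϵ = 10⁻⁸, 250 steps). The combination rule h = g_{v,ϵ}(f) − g_{v,ϵ}(f̂) + ∇f̂ and the small MLP surrogate (the paper uses a CNN over θ) are illustrative assumptions, not the authors' implementation.

```python
import torch

torch.set_default_dtype(torch.float64)  # eps = 1e-8 needs double precision for finite differences

def forward_grad(fn, theta, v, eps):
    """Finite-difference forward gradient: ((fn(theta + eps*v) - fn(theta)) / eps) * v."""
    return (fn(theta + eps * v) - fn(theta)) / eps * v

def control_variate_forward_grad(f, f_hat, theta, eps=1e-8):
    """Assumed form: h = g_{v,eps}(f) - g_{v,eps}(f_hat) + grad f_hat(theta)."""
    v = torch.randn_like(theta)                       # shared random tangent direction
    with torch.no_grad():
        g = forward_grad(f, theta, v, eps)            # black-box forward gradient
        g_hat = forward_grad(f_hat, theta, v, eps)    # same estimator on the surrogate
    theta_req = theta.detach().requires_grad_(True)
    grad_hat = torch.autograd.grad(f_hat(theta_req), theta_req)[0]  # exact surrogate gradient
    return g - g_hat + grad_hat

# Black-box objective: the Sphere function stands in for a non-differentiable solver loss.
f = lambda th: (th ** 2).sum()

# Surrogate f_hat: a small MLP here for brevity (the paper uses a CNN over theta).
net = torch.nn.Sequential(torch.nn.Linear(8, 64), torch.nn.Tanh(), torch.nn.Linear(64, 1))
f_hat = lambda th: net(th).squeeze()
opt_hat = torch.optim.Adam(net.parameters(), lr=0.01)     # online surrogate optimizer

theta = torch.randn(8, requires_grad=True)
opt = torch.optim.Adam([theta], lr=0.1)                   # main optimizer, alpha = 0.1

for step in range(250):                                   # 250 steps, as for the Sphere function
    h = control_variate_forward_grad(f, f_hat, theta.detach(), eps=1e-8)
    opt.zero_grad()
    theta.grad = h                                        # feed the estimator to the main optimizer
    opt.step()

    # Online surrogate update (no pre-training): regress f_hat onto the observed black-box value.
    opt_hat.zero_grad()
    loss_hat = (f_hat(theta.detach()) - f(theta.detach())) ** 2
    loss_hat.backward()
    opt_hat.step()

print("final objective:", f(theta.detach()).item())
```

In this construction both finite-difference terms share the same random direction v, so the estimate stays (up to finite-difference error) unbiased for ∇f while its variance shrinks as the online surrogate f̂ approaches f and the two forward-gradient terms cancel.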