Swarm Reinforcement Learning for Adaptive Mesh Refinement

Authors: Niklas Freymuth, Philipp Dahlinger, Tobias Würth, Simon Reisch, Luise Kärger, Gerhard Neumann

NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimentally, we show the effectiveness of our approach on a suite of PDEs that require complex and challenging refinement strategies, including a non-stationary heat diffusion problem and a linear elasticity task. We implement our tasks as OpenAI gym [40] environments. We implement and compare to current state-of-the-art RL methods for AMR [28, 29, 30] that have been shown to work well on dynamic tasks where shallow mesh refinement and coarsening is sufficient. We conduct a series of ablations to show which parts of the approach make it uniquely effective... We visualize the mesh quality quantitatively with a Pareto plot of the number of elements and the remaining error." (See the gym-style environment sketch after the table.)
Researcher Affiliation | Academia | "Niklas Freymuth (1), Philipp Dahlinger (1), Tobias Würth (2), Simon Reisch (1), Luise Kärger (2), Gerhard Neumann (1); (1) Autonomous Learning Robots, Karlsruhe Institute of Technology, Karlsruhe; (2) Institute of Vehicle Systems Technology, Karlsruhe Institute of Technology, Karlsruhe; correspondence to niklas.freymuth@kit.edu"
Pseudocode | No | The paper describes the Message Passing Network architecture in Appendix A using mathematical equations and textual descriptions, but it does not include a block explicitly labeled as 'Pseudocode' or 'Algorithm'. (See the generic message passing sketch after the table.)
Open Source Code | Yes | "We publish the first codebase on RL for AMR, including all methods and tasks presented in this paper, to facilitate research in this direction. The code is available at https://github.com/NiklasFreymuth/ASMR."
Open Datasets | No | "All learned methods are trained on 100 PDEs and their corresponding initial and reference meshes Ω0, Ω to limit the number of required reference meshes during training."
Dataset Splits | No | "All learned methods are trained on 100 PDEs... We evaluate the resulting final policies on 100 different evaluation PDEs that we keep consistent across random seeds for better comparability. These PDEs are disjoint from the training PDEs..." (See the train/evaluation split sketch after the table.)
Hardware Specification | Yes | "All experiments are run for up to 2 days on 8 cores of an Intel Xeon Platinum 8358 CPU. ... We use a single 8-Core AMD Ryzen 7 3700X Processor for all measurements."
Software Dependencies | No | "All networks are implemented in PyTorch [104] and trained using the ADAM optimizer [105] with a learning rate of 3.0e-4 unless mentioned otherwise. ... The PDEs and the FEM are implemented using scikit-fem [89], and we use conforming triangular meshes and linear elements unless mentioned otherwise. The code provides OpenAI gym [40] environments for all tasks." (See the scikit-fem sketch after the table.)
Experiment Setup | Yes | "We train each PPO policy for a total of 400 iterations. In each iteration, the algorithm samples 256 environment transitions and then trains on them for 5 epochs with a batch size of 32. The value function loss is multiplied with a factor of 0.5 and we clip the gradient norm to 0.5. The policy and value function clip ranges are chosen to be 0.2. We normalize the observations with a running mean and standard deviation. The discount factor is γ = 0.99 and advantages are estimated via Generalized Advantage Estimate [99] with λ = 0.95." (See the PPO configuration sketch after the table.)
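
The "Research Type" row notes that the tasks are implemented as OpenAI gym environments. As a rough illustration of what that interface looks like for a per-element refinement decision, here is a minimal sketch; the class name, feature layout, and reward are hypothetical placeholders and are not taken from the ASMR codebase.

```python
# Hypothetical toy environment illustrating the classic gym interface
# (reset/step, observation_space/action_space); not the paper's actual AMR tasks.
import gym
import numpy as np
from gym import spaces


class ToyRefinementEnv(gym.Env):
    """One binary refine/keep decision per mesh element, single-step episodes."""

    def __init__(self, num_elements=64, num_features=4):
        self.num_elements = num_elements
        self.num_features = num_features
        # Per-element features, e.g. an error indicator and element size (placeholders).
        self.observation_space = spaces.Box(
            low=-np.inf, high=np.inf,
            shape=(num_elements, num_features), dtype=np.float32,
        )
        # One refine/keep decision per element.
        self.action_space = spaces.MultiBinary(num_elements)

    def reset(self):
        self._obs = np.random.randn(self.num_elements, self.num_features).astype(np.float32)
        return self._obs

    def step(self, action):
        # Placeholder reward: trade a fake "error reduction" against the element count.
        reward = float(self._obs[:, 0][action == 1].sum()) - 0.1 * float(action.sum())
        return self._obs, reward, True, {}
```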
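
The "Pseudocode" row points out that the message passing network is described only through equations in Appendix A. For readers unfamiliar with the general pattern, below is a generic message passing step in PyTorch; the layer sizes, residual updates, and mean aggregation are common choices assumed here, not a reconstruction of the paper's architecture.

```python
# Generic message passing step (sketch of the idea only, not the Appendix A spec).
import torch
import torch.nn as nn


class MessagePassingStep(nn.Module):
    def __init__(self, node_dim, edge_dim, hidden_dim=64):
        super().__init__()
        # Edge update: combine sender, receiver, and current edge features.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + edge_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, edge_dim),
        )
        # Node update: combine node features with aggregated incoming messages.
        self.node_mlp = nn.Sequential(
            nn.Linear(node_dim + edge_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, node_dim),
        )

    def forward(self, nodes, edges, senders, receivers):
        # nodes: (N, node_dim), edges: (E, edge_dim), senders/receivers: (E,) index tensors.
        edge_inputs = torch.cat([nodes[senders], nodes[receivers], edges], dim=-1)
        edges = edges + self.edge_mlp(edge_inputs)  # residual edge update
        # Mean-aggregate messages per receiving node.
        agg = torch.zeros(nodes.size(0), edges.size(-1), device=nodes.device)
        agg = agg.index_add(0, receivers, edges)
        ones = torch.ones(receivers.size(0), device=nodes.device)
        counts = torch.zeros(nodes.size(0), device=nodes.device)
        counts = counts.index_add(0, receivers, ones).clamp(min=1.0).unsqueeze(-1)
        nodes = nodes + self.node_mlp(torch.cat([nodes, agg / counts], dim=-1))
        return nodes, edges


# Tiny usage example: 5 nodes, 8 directed edges with random features.
step = MessagePassingStep(node_dim=16, edge_dim=8)
senders, receivers = torch.randint(0, 5, (8,)), torch.randint(0, 5, (8,))
nodes, edges = step(torch.randn(5, 16), torch.randn(8, 8), senders, receivers)
```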
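
The "Dataset Splits" row describes 100 training PDEs and 100 evaluation PDEs that stay fixed across random seeds. One simple way to realize such a protocol is to derive the evaluation instances from a dedicated, fixed seed; the sampling function and seed value below are illustrative assumptions, not the repository's actual logic.

```python
# Sketch of a train/evaluation split protocol with a fixed evaluation seed (assumed,
# not taken from the ASMR codebase).
import numpy as np


def sample_pde_parameters(rng, n):
    # Placeholder: each "PDE" is summarized by a 2D parameter vector (e.g. load position).
    return rng.uniform(size=(n, 2))


def make_splits(train_seed, n_train=100, n_eval=100, eval_seed=12345):
    # Training PDEs depend on the run's seed; evaluation PDEs use one fixed seed,
    # so every run is evaluated on the same set, disjoint from its training PDEs.
    train_pdes = sample_pde_parameters(np.random.default_rng(train_seed), n_train)
    eval_pdes = sample_pde_parameters(np.random.default_rng(eval_seed), n_eval)
    return train_pdes, eval_pdes
```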
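
The "Software Dependencies" row mentions scikit-fem with conforming triangular meshes and linear elements. The following is a standard Poisson example with P1 elements on a triangular mesh using a recent scikit-fem API, included only to show the kind of FEM building blocks involved; it is not code from the ASMR repository.

```python
# Standard scikit-fem usage: Poisson problem, conforming triangular mesh, linear elements.
from skfem import Basis, BilinearForm, ElementTriP1, LinearForm, MeshTri, condense, solve
from skfem.helpers import dot, grad

mesh = MeshTri().refined(3)          # unit square, uniformly refined triangular mesh
basis = Basis(mesh, ElementTriP1())  # linear (P1) elements


@BilinearForm
def laplace(u, v, _):
    return dot(grad(u), grad(v))


@LinearForm
def load(v, _):
    return 1.0 * v


A = laplace.assemble(basis)
b = load.assemble(basis)

# Homogeneous Dirichlet boundary conditions via condensation of boundary DOFs.
u = solve(*condense(A, b, D=basis.get_dofs()))
print(u.shape)  # one value per mesh vertex
```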
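
The "Experiment Setup" row fixes all PPO hyperparameters. Mapped onto stable-baselines3 for illustration (an assumption: the paper ships its own PPO implementation, and the environment below is a stand-in rather than one of the AMR tasks), the same settings look as follows.

```python
# Reported PPO hyperparameters expressed as a stable-baselines3 configuration
# (illustrative only; not the authors' implementation or environments).
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import VecNormalize

# Running-mean/std observation normalization, as described in the paper.
env = VecNormalize(make_vec_env("CartPole-v1", n_envs=1),
                   norm_obs=True, norm_reward=False)

model = PPO(
    "MlpPolicy",
    env,
    learning_rate=3.0e-4,   # ADAM learning rate
    n_steps=256,            # 256 environment transitions per iteration
    batch_size=32,
    n_epochs=5,             # 5 epochs per iteration
    gamma=0.99,             # discount factor
    gae_lambda=0.95,        # Generalized Advantage Estimation lambda
    clip_range=0.2,         # policy clip range
    clip_range_vf=0.2,      # value function clip range
    vf_coef=0.5,            # value function loss coefficient
    max_grad_norm=0.5,      # gradient norm clipping
)
model.learn(total_timesteps=400 * 256)  # 400 iterations of 256 transitions
```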