Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Diffusion-Based Hierarchical Graph Neural Networks for Simulating Nonlinear Solid Mechanics

Authors: Tobias Würth, Niklas Freymuth, Gerhard Neumann, Luise Kärger

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate ROBIN on challenging 2D and 3D solid mechanics benchmarks involving geometric, material, and contact nonlinearities. ROBIN achieves state-of-the-art accuracy on all tasks, substantially outperforming existing next-step learned simulators while reducing inference time by up to an order of magnitude compared to standard diffusion simulators.
Researcher Affiliation	Academia	Tobias Würth1 Niklas Freymuth2 Gerhard Neumann2 Luise Kärger1 1Institute of Vehicle System Technology, Karlsruhe Institute of Technology, Karlsruhe 2Autonomous Learning Robots, Karlsruhe Institute of Technology, Karlsruhe
Pseudocode	No	The paper describes the methodology using textual descriptions, mathematical equations, and figures (e.g., Figure 1 for an overview of ROBIN prediction), but it does not include explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code	No	We do not provide data and code at the time of submission, but will open-source both after acceptance.
Open Datasets	Yes	We evaluate our model on the three different datasets, namely BENDINGBEAM, IMPACTPLATE [22] and DEFORMINGPLATE [5].
Dataset Splits	Yes	We create a total number of 1000 simulations for training, 100 for validation and 100 for testing.
Hardware Specification	Yes	We train all models on a single NVIDIA A100 GPU with a maximum training time of 48 hours, while most models required approximately 40 hours.
Software Dependencies	No	We implement ROBIN in PyTorch [62] and train it with ADAM [63]. We use the official TensorFlow [65] implementation of the authors for the baselines HCMT [22] and MGN [5]. For BSMS [48], we use the official PyTorch [62] implementation of the authors. The solutions are created with scikit-fem [61], iteratively solved using Newton-Raphson until the residual fell below a tolerance of 10^-8.
Experiment Setup	Yes	We use an exponential learning rate decay, which decreases the learning rate from 1e-4 to 1e-6 over the training time, including 1000 linearly increasing warm-up steps. We clip gradients such that their L2-norm doesn't exceed 1. We train ROBIN in BENDINGBEAM with 9M samples and in IMPACTPLATE with 6M samples both with a batch size of 16, resulting in 562,500 and 375,000 training iterations. In DEFORMINGPLATE we reduce the batch size to 12 and train for 300,000 iterations with 3.6M samples. We use 3 Pre- and 3 Post-processing layers, 2 Up- and 2 Downsampling layers and 5 Solving layers, which yields a total number of 15 learnable layers. We add a layer norm before each MLP and use two linear layers, a hidden size of 128 and a Sigmoid Linear Unit (SiLU) [64] activation function. A max aggregation is used in all message passing layers. We use K = 20 denoising steps and a denoising stride of m = 5 for ROBIN by default. The β variances of the DDPM scheduler are geometrically spaced for training and inference, starting from a minimum noise variance of 1e-4 (for β1) and going up to 1.0 (for βK).
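
The numbers quoted in the Experiment Setup row can be cross-checked with a short script. The following is a minimal sketch in plain Python, not the authors' code (which, per the Open Source Code row, was not available at submission). The function names are hypothetical. The schedule shape is also an assumption: the warm-up is taken to rise linearly to 1e-4 over the first 1000 steps, the exponential decay is taken to span the remaining steps, and "geometrically spaced" is read as a constant ratio between consecutive β values.

```python
import math


def training_iterations(num_samples: int, batch_size: int) -> int:
    """Optimizer steps needed to consume num_samples at the given batch size."""
    return num_samples // batch_size


def learning_rate(step: int, total_steps: int, warmup_steps: int = 1000,
                  lr_start: float = 1e-4, lr_end: float = 1e-6) -> float:
    """Linear warm-up to lr_start, then exponential decay to lr_end.

    Assumption: the decay covers the steps after warm-up; the quoted text
    does not say how the two phases are combined.
    """
    if step < warmup_steps:
        # Linearly increasing warm-up, reaching lr_start on the last warm-up step.
        return lr_start * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Exponential interpolation between lr_start and lr_end.
    return lr_start * (lr_end / lr_start) ** progress


def geometric_betas(num_steps: int = 20, beta_min: float = 1e-4,
                    beta_max: float = 1.0) -> list[float]:
    """Variances beta_1..beta_K with a constant ratio between neighbours."""
    ratio = (beta_max / beta_min) ** (1.0 / (num_steps - 1))
    return [beta_min * ratio ** k for k in range(num_steps)]


if __name__ == "__main__":
    # Samples and batch sizes as quoted in the table.
    for name, samples, batch in [("BENDINGBEAM", 9_000_000, 16),
                                 ("IMPACTPLATE", 6_000_000, 16),
                                 ("DEFORMINGPLATE", 3_600_000, 12)]:
        print(name, training_iterations(samples, batch))

    # 3 pre- + 3 post-processing, 2 up- + 2 downsampling, 5 solving layers.
    print("learnable layers:", 3 + 3 + 2 + 2 + 5)

    betas = geometric_betas()
    print("beta_1 =", betas[0], "beta_K =", betas[-1])
```

Under these assumptions the iteration counts come out to the 562,500, 375,000 and 300,000 reported in the table, and the layer count sums to 15. The other quoted settings (gradient clipping, the denoising stride m = 5) are left out because the quoted text does not give enough detail to sketch them.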