Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scalable Mechanistic Neural Networks

Authors: Jiale Chen, Dingling Yao, Adeel Pervez, Dan Alistarh, Francesco Locatello

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate that S-MNN matches the original MNN in precision while substantially reducing computational resources. Consequently, S-MNN can drop-in replace the original MNN in applications, providing a practical and efficient tool for integrating mechanistic bottlenecks into neural network models of complex dynamical systems.
Researcher Affiliation Academia Jiale Chen, Dingling Yao, Adeel Pervez, Dan Alistarh, Francesco Locatello Institute of Science and Technology Austria (ISTA)
Pseudocode Yes Algorithm 1: Solver Forward Pass Algorithm 2: Solver Backward Pass Algorithm 3: Decompose Algorithm 4: Substitute Algorithm 5: Train an Epoch Algorithm 6: Test
Open Source Code Yes Source code is available at https://github.com/IST-DASLab/Scalable_MNN. To facilitate this, we share the source code of both our S-MNN solver and experiments as part of the supplementary materials.
Open Datasets Yes To demonstrate the effectiveness and scalability of our proposed Scalable Mechanistic Neural Network (S-MNN), we conduct experiments across multiple settings in scientific machine learning applications for dynamical systems, including governing equation discovery for the Lorenz system (Section 5.2), solving the Korteweg-de Vries (KdV) partial differential equation (PDE) (Section 5.3), and sea surface temperature (SST) prediction for modeling long real-world temporal sequences (Section 5.4). Experiment Settings. We select five linear ODE problems from ODEBench (d'Ascoli et al., 2024) (RC Circuit, Population Growth, Language Death Model, Harmonic Oscillator, and Harmonic Oscillator with Damping) that are commonly used in various scientific fields such as physics and biology, along with an additional third-order ODE. Mathematical details about these ODEs are provided in Appendix B.1. Dataset. We consider the KdV dataset provided by Brandstetter et al. (2022). Dataset. We use the SST-V2 dataset (Huang et al., 2021), which provides weekly mean sea surface temperatures for 1,727 weeks from December 31, 1989, to January 29, 2023, over a 1° latitude × 1° longitude global grid (180 × 360).
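As a hedged illustration of the kind of data behind the Lorenz equation-discovery experiment (this sketch is not from the paper's codebase), trajectories of the Lorenz system can be generated with a classic fourth-order Runge-Kutta integrator; the parameters sigma=10, rho=28, beta=8/3 are the standard chaotic values, while the step size and horizon here are illustrative choices:

```python
# Sketch: generate Lorenz-system trajectories of the sort used as training
# data for governing-equation discovery. Not the paper's implementation.

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz ODE at the given (x, y, z) state."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, dt):
    """One classic fourth-order Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def simulate(initial=(1.0, 1.0, 1.0), dt=0.01, steps=1000):
    """Integrate the Lorenz system and return the list of visited states."""
    trajectory = [initial]
    for _ in range(steps):
        trajectory.append(rk4_step(lorenz, trajectory[-1], dt))
    return trajectory

trajectory = simulate()
```

In practice such reference trajectories are produced with an adaptive solver (the paper mentions SciPy's integrators); the fixed-step RK4 above merely keeps the sketch self-contained.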
Dataset Splits Yes Dataset. We consider the KdV dataset provided by Brandstetter et al. (2022). The dataset consists of 512 samples each for training, validation, and testing. Dataset. We use the SST-V2 dataset (Huang et al., 2021), which provides weekly mean sea surface temperatures for 1,727 weeks from December 31, 1989, to January 29, 2023, over a 1° latitude × 1° longitude global grid (180 × 360). Experiment Settings. We employ S-MNN and MNN with the Mechanistic Identifier proposed by Yao et al. (2024) to predict SST data. ... The dataset is split so that the latest chunk of measurements is reserved for testing while the remaining data is used for training.
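The chronological SST split described above can be sketched as follows (an assumed implementation, not the paper's code; the 208-week test chunk mirrors the chunk length reported in the experiment settings):

```python
# Sketch: hold out the latest chunk of a weekly time series for testing,
# keeping everything earlier for training, as described for the SST data.

def chronological_split(num_weeks, test_chunk):
    """Return (train_indices, test_indices) with the newest chunk held out."""
    if not 0 < test_chunk < num_weeks:
        raise ValueError("test_chunk must be between 1 and num_weeks - 1")
    split = num_weeks - test_chunk
    return list(range(split)), list(range(split, num_weeks))

# Example: 1,727 weeks of SST-V2 data, reserving one 208-week chunk.
train_idx, test_idx = chronological_split(1727, 208)
```

A chronological (rather than random) split is the natural choice here, since random shuffling would leak future measurements into training for a forecasting task.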
Hardware Specification Yes Then, to assess the scalability of our method, we measure the runtime and the GPU memory consumption across different sequence lengths and batch sizes using an NVIDIA H100 GPU with 80 GB of memory. We run all solvers on CPUs (AMD EPYC 7513 32-Core Processor).
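A minimal sketch of the benchmarking pattern behind such runtime measurements, using only wall-clock timing with a placeholder workload (the actual experiments time the S-MNN solver on an H100; on GPU one would also synchronize before reading the clock and query peak memory via e.g. `torch.cuda.max_memory_allocated`, neither of which is shown here):

```python
# Sketch: measure best-of-N wall-clock runtime across problem sizes,
# analogous to sweeping sequence lengths in a scalability benchmark.
import time

def benchmark(workload, repeats=3):
    """Return the best wall-clock time of `workload` over several repeats."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        best = min(best, time.perf_counter() - start)
    return best

# Placeholder workload whose cost grows with the "sequence length" n.
timings = {n: benchmark(lambda: sum(i * i for i in range(n)))
           for n in (1_000, 10_000, 100_000)}
```

Taking the best of several repeats reduces noise from warm-up and scheduling jitter, which matters when comparing solvers across sizes.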
Software Dependencies No The paper mentions software such as the scipy.integrate.odeint and scipy.integrate.solve_ivp functions from SciPy, and CUDA Graphs, but does not specify version numbers for these or other key libraries/frameworks.
Experiment Setup Yes Experiment Settings. We apply our solver to the same network architecture and dataset in Pervez et al. (2024). We train the model with the default settings: sequence length of 50 and batch size of 512. Experiment Settings. We model the temporal evolution at each spatial point as an independent ODE. ... The sequence length is set to 10 seconds, and the model is trained to predict the wave profile over the next 9 seconds. The model is trained for 800 epochs. Experiment Settings. ... We set the default batch size to 12,960 (corresponding to 6,480 pairs of grid points and their randomly selected neighboring points) and the sequence length (chunk length) to 208 weeks. ... The model is trained for 1,000 epochs.
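Collected as a configuration sketch, the hyperparameters quoted above are (the values are exactly those reported; the dict layout and key names are an assumption for readability):

```python
# Sketch: the reported per-experiment settings gathered in one place.
EXPERIMENT_SETTINGS = {
    "lorenz_discovery": {"sequence_length": 50, "batch_size": 512},
    "kdv_pde": {"input_seconds": 10, "predict_seconds": 9, "epochs": 800},
    "sst_prediction": {
        "batch_size": 12_960,          # 6,480 grid-point pairs
        "sequence_length_weeks": 208,  # chunk length
        "epochs": 1_000,
    },
}
```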