Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

FlashMD: long-stride, universal prediction of molecular dynamics

Authors: Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi, Michele Ceriotti

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate FlashMD's accuracy in reproducing equilibrium and time-dependent properties, using both system-specific and general-purpose models, extending the ability of MD simulation to reach the long time scales needed to model microscopic processes of high scientific and technological relevance.
Researcher Affiliation	Academia	Filippo Bigi (Institute of Materials, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland); Sanggyu Chong (Institute of Materials, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland); Agustinus Kristiadi (Department of Computer Science, Western University & Vector Institute, London, ON N6A 3K7, Canada); Michele Ceriotti (Institute of Materials, Ecole Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland)
Pseudocode	No	The paper includes schematic diagrams (e.g., Figure 1: "Schematic overview of FlashMD.", Figure 4: "Implementation of the random rotation filter.", Figure 5: "Implementation details of the energy conservation enforcement filter in FlashMD.", Figure 6: "Integration of thermostats and barostats with FlashMD for thermodynamic ensemble control.") that describe the methodology and workflows, but no explicitly labeled pseudocode or algorithm blocks with structured, code-like steps.
Open Source Code	Yes	Helper functions to download universal FlashMD models and to prepare simulations are distributed with the flashmd package available on PyPI. Further information and instructions can be found at https://flashmd.org, including links to the training datasets and scripts to reproduce the reported results on Hugging Face and Materials Cloud.
Open Datasets	Yes	To demonstrate the capabilities of FlashMD, we trained two types of models: water-specific models trained on a dataset of MD trajectories for liquid water, and general-purpose, universal models trained on MD trajectories of structures sampled from the MAD dataset (see Ref. [39]). ... All reference MD simulations were performed with the PET-MAD universal MLIP [39].
Dataset Splits	Yes	Water-specific models: A water structure at experimental density (at 298 K and 1 atm) was equilibrated with PET-MAD (or q-TIP4P/f for the q-TIP4P/f-based water models discussed in Appendix J). Subsequently, two more structures were generated by increasing and decreasing the volume of the cell by 10%, scaling the atomic positions accordingly. For each of the three resulting structures, NVT equilibration runs were performed at all temperatures between 20 and 1000 K, in steps of 20 K, with a time step of 0.5 fs and a duration of 5 ps, using a Langevin thermostat with a characteristic time of 10 fs. Subsequently, each equilibrated structure was used to produce an NVE MD trajectory of 2 ps with a time step of 0.25 fs. Structures for training were extracted from these trajectories every 100 fs, and augmented with their time-reversed version, for a total of 5400 structures. Universal models: 10,000 structures from the MAD dataset, used in the training of PET-MAD [39], baseline MLIP, were randomly selected for reference MD trajectory generation (see Ref. [39] for further details on the MAD dataset). The initial geometry was first energetically optimized with the BFGS algorithm until the maximum force component threshold of 0.01 eV/Å was reached. The energy-optimized system was put through equilibration under the NVT ensemble for 10 ps with timesteps of 0.5 fs. A characteristic time of 100 fs was used in the Langevin thermostat. The final configuration from NVT equilibration was then taken for trajectory production under the NVE ensemble for 2.5 ps with finer timesteps of 0.25 fs. Positions and momenta were recorded every timestep for FlashMD training. Simulations were repeated 10 times for each structure, with a randomly selected temperature between 0 and 1500 K. Structures for training were chosen from these trajectories every 500 fs (5 samples per NVE trajectory to avoid time-correlated samples), and augmented with their time-reversed version, for a total of 1 million structures.
Hardware Specification	Yes	The use of computational resources in this work mainly stems from the generation of the universal dataset of NVE trajectories, which employed 20,000 GPU hours on an Nvidia GH200 cluster. Model training was performed on Nvidia H100 GPUs, for a total of around 3,000 GPU hours. All other experiments, mostly molecular dynamics, were run either on H100 or L40S GPUs, and they do not contribute to the overall total compute in a significant way.
Software Dependencies	Yes	All simulations were performed using the Atomic Simulation Environment [89] (ASE) software (version 3.24).
Experiment Setup	Yes	Optimization is carried out using the Adam [88] optimizer with an initial learning rate of 3 × 10⁻⁴. Learning rate decay is applied at regular intervals of 100 and 50 epochs for the water and universal models, respectively. Water-specific models: A water structure at experimental density (at 298 K and 1 atm) was equilibrated with PET-MAD... For each of the three resulting structures, NVT equilibration runs were performed at all temperatures between 20 and 1000 K, in steps of 20 K, with a time step of 0.5 fs and a duration of 5 ps, using a Langevin thermostat with a characteristic time of 10 fs.
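
The dataset sizes quoted in the Dataset Splits row can be checked with a little bookkeeping. The sketch below is not taken from the paper or from the flashmd package: the helper name n_steps and all variable names are hypothetical, and only the durations, time steps, strides, and counts come from the quoted text. For the water-specific models, the quoted text gives only the total of 5400 structures, so the frames-per-trajectory figure is derived under the assumption that all trajectories contribute equally.

```python
# Bookkeeping sketch for the data-generation protocol quoted in the table.
# Names are hypothetical; the numbers are the ones stated in the quoted text.

def n_steps(duration_fs: float, timestep_fs: float) -> int:
    """Number of integration steps needed to cover duration_fs."""
    return round(duration_fs / timestep_fs)

# --- Universal models ---
nvt_steps = n_steps(10_000, 0.5)     # 10 ps NVT equilibration at 0.5 fs
nve_steps = n_steps(2_500, 0.25)     # 2.5 ps NVE production at 0.25 fs
stride_steps = n_steps(500, 0.25)    # one training frame every 500 fs
samples_per_traj = nve_steps // stride_steps

n_structures = 10_000                # structures drawn from the MAD dataset
n_repeats = 10                       # repeats per structure, random temperature
time_reversal = 2                    # each frame plus its time-reversed copy
universal_total = n_structures * n_repeats * samples_per_traj * time_reversal

# --- Water-specific models ---
water_temperatures = range(20, 1001, 20)          # 20 K to 1000 K in 20 K steps
water_trajs = 3 * len(water_temperatures)         # three cell volumes
# Assumption: every trajectory contributes the same number of frames.
water_frames_per_traj = 5400 // (time_reversal * water_trajs)

if __name__ == "__main__":
    print("NVT steps:", nvt_steps)
    print("NVE steps:", nve_steps)
    print("samples per NVE trajectory:", samples_per_traj)
    print("universal training structures:", universal_total)
    print("water trajectories:", water_trajs)
    print("implied water frames per trajectory:", water_frames_per_traj)
```

Under these numbers the universal-model protocol gives 5 samples per NVE trajectory and 1,000,000 training structures, matching the quoted totals. For water, 150 trajectories and a total of 5400 structures imply 18 frames per trajectory before time reversal, fewer than the 20 that a 100 fs stride over 2 ps would allow; the quoted text does not say which frames are left out.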