Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints

Authors: Utkarsh Utkarsh, Pengfei Cai, Alan Edelman, Rafael Gomez-Bombarelli, Christopher Rackauckas

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Empirically, PCFM outperforms both unconstrained and constrained baselines on a range of PDEs, including those with shocks, discontinuities, and sharp features, while ensuring exact constraint satisfaction at the final solution. Our method provides a flexible framework for enforcing hard constraints in both scientific and general-purpose generative models, especially in applications where constraint satisfaction is essential.
Researcher Affiliation	Academia	Utkarsh (Massachusetts Institute of Technology); Pengfei Cai (Massachusetts Institute of Technology); Alan Edelman (Massachusetts Institute of Technology); Rafael Gomez-Bombarelli (Massachusetts Institute of Technology); Christopher Vincent Rackauckas (Massachusetts Institute of Technology)
Pseudocode	Yes	Algorithm 1 PCFM: Physics-Constrained Flow Matching. Require: Flow model $v_\theta(u, \tau)$, constraint residual $h(u)$, initial state $u_0$, steps $N$, penalty $\lambda$. Ensure: Final state $u_1$ such that $h(u_1) = 0$.
Open Source Code	Yes	We open-source our code: a Python implementation at https://github.com/cpfpengfei/PCFM and an experimental Julia implementation at https://github.com/utkarsh530/PCFM.jl.
Open Datasets	No	For every problem, we construct a PDE numerical solution dataset with two degrees of freedom by varying initial (IC) and boundary conditions (BC) (see Appendix B), and pre-train an unconditional FFM model [11] on this dataset (see Appendix C).
Dataset Splits	Yes	For training, we generate 10000 simulations by sampling 100 random initial vorticities and 100 forcing phases. For test dataset, we sample an additional 10 vorticities and 100 forces, yielding 1000 solutions.
Hardware Specification	Yes	FFM models for Heat, Reaction-Diffusion, and Burgers are each trained over 20,000 steps on 1 NVIDIA V100 GPU and the Navier-Stokes model is trained for 500,000 steps on 4 NVIDIA A100 GPUs.
Software Dependencies	No	We implement the solver in a batched and differentiable fashion to support inference across samples. For a batch of inputs $\{u_1^{i}\}_{i=1}^{B}$, we evaluate Jacobians $J_i$, residuals $h(u^{i})$, and solve the corresponding Schur systems in parallel using vectorized operations and autodiff-compatible backends (e.g., PyTorch with batched Cholesky or linear solvers) [71].
Experiment Setup	Yes	Optimizer: Adam optimizer with learning rate $3 \times 10^{-4}$, $\beta_1 = 0.9$, $\beta_2 = 0.999$, and no weight decay. Learning rate scheduler: Reduce-on-plateau scheduler with a factor of 0.5, patience of 10 validation steps, and a minimum learning rate of $1 \times 10^{-4}$. Batch size: 256 for 1D problems (Heat, Reaction-Diffusion, Burgers) and 24 for the 2D Navier-Stokes problem. For all comparisons, we adopt the explicit Euler integration scheme unless otherwise specified. We use 100 Euler update (flow matching) steps for Heat and Navier-Stokes, and 200 steps for Reaction-Diffusion and Burgers (IC or BC).
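
The Experiment Setup row mentions explicit Euler integration with 100 flow-matching steps, and the Pseudocode row states that the final state should satisfy h(u_1) = 0. The sketch below illustrates only that general pattern in plain Python and is not the authors' Algorithm 1. The names `euler_sample`, `project_to_sum`, `toy_velocity`, and `TOY_TARGET` are hypothetical, the velocity field is a hand-written stand-in for a trained flow model, and the constraint is assumed to be a single linear one (a fixed sum), enforced by one orthogonal projection at the end.

```python
from typing import Callable, List

Vector = List[float]


def euler_sample(velocity: Callable[[Vector, float], Vector],
                 u0: Vector, n_steps: int) -> Vector:
    """Integrate du/dtau = velocity(u, tau) from tau=0 to tau=1 with explicit Euler."""
    u = list(u0)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        tau = k * dt  # the last step uses tau = 1 - dt, so tau never reaches 1
        v = velocity(u, tau)
        u = [u_j + dt * v_j for u_j, v_j in zip(u, v)]
    return u


def project_to_sum(u: Vector, target_sum: float) -> Vector:
    """Orthogonal projection onto the hyperplane {u : sum(u) = target_sum}.

    Assumption: a single linear constraint h(u) = sum(u) - target_sum.
    This is a simplified stand-in, not the constraint solver from the paper.
    """
    shift = (target_sum - sum(u)) / len(u)
    return [u_j + shift for u_j in u]


# Hypothetical target field used only by the toy velocity below.
TOY_TARGET = [1.0, 2.0, 3.0, 4.0]


def toy_velocity(u: Vector, tau: float) -> Vector:
    """Stand-in for a trained model v_theta(u, tau): straight-line flow to TOY_TARGET."""
    return [(t_j - u_j) / (1.0 - tau) for t_j, u_j in zip(TOY_TARGET, u)]


# 100 Euler steps, matching the step count quoted for Heat and Navier-Stokes.
u_free = euler_sample(toy_velocity, [0.0, 0.0, 0.0, 0.0], n_steps=100)

# Enforce the assumed hard constraint sum(u) = 10.5 on the final state only.
u_final = project_to_sum(u_free, target_sum=10.5)

print("unconstrained sum:", sum(u_free))
print("constrained sum:  ", sum(u_final))
```

Projecting only at the end keeps the sketch short. For nonlinear constraints a single projection would generally not suffice, and an iterative solve on the residual would be needed instead.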