Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
ENMA: Tokenwise Autoregression for Continuous Neural PDE Operators
Authors: Armand Kassaï Koupaï, Lise Le Boudec, Louis Serrano, Patrick Gallinari
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to evaluate ENMA. Section 4.2 assesses the encoder-decoder in terms of reconstruction error, time-stepping accuracy, and compression rate, comparing against standard neural operator baselines. Section 4.3 evaluates ENMA's generative forecasting ability for both temporal conditioning and Initial Value Problem with context trajectory. |
| Researcher Affiliation | Collaboration | 1 Sorbonne Université, CNRS, ISIR, 75005 Paris, France 2 Criteo AI Lab, Paris, France |
| Pseudocode | Yes | We present the full inference method of ENMA for latent generation in the pseudo-code 1: Algorithm 1: ENMA Inference: Autoregressive Latent Generation with Cosine Masked Decoding |
| Open Source Code | No | We will release the code upon acceptance for reproducibility. In the meantime, we precisely detail all training and inference details in the appendices. |
| Open Datasets | Yes | We evaluate ENMA on standard public benchmarks (Rayleigh-Bénard and Active Matter, (Ohana et al., 2024)). |
| Dataset Splits | Yes | For each system, we generate 12,000 training and 1,200 test trajectories, using a batch size of 10. For evaluation, we generate two test sets: 1,200 trajectories for in-distribution (In-D) and 120 for out-of-distribution (Out-D) evaluation. |
| Hardware Specification | Yes | All experiments were conducted on an A100. |
| Software Dependencies | No | The code has been written in PyTorch (Paszke et al., 2019). |
| Experiment Setup | Yes | Optimizer and Learning Rate Schedule: We use the AdamW optimizer with β1 = 0.9 and β2 = 0.95 for all experiments. The learning rate follows a cosine decay schedule, starting from an initial value of 10⁻³ and annealing to 10⁻⁵ over the course of training. To stabilize the early training phase, we apply a linear warmup over the first 500 optimization steps. |
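
The learning rate schedule quoted in the Experiment Setup row (linear warmup over 500 steps, then cosine decay from 10⁻³ to 10⁻⁵) can be sketched as a plain Python function. This is a minimal illustration, not the authors' code: the function name `learning_rate`, the `total_steps` argument, the choice to ramp the warmup from near zero, and the per-step (rather than per-epoch) update are all assumptions, since the quoted text does not specify them.

```python
import math


def learning_rate(step, total_steps, warmup_steps=500,
                  peak_lr=1e-3, final_lr=1e-5):
    """Hypothetical warmup + cosine decay schedule (0-indexed steps).

    Only warmup_steps, peak_lr, and final_lr come from the quoted setup;
    total_steps and the warmup starting point are assumptions.
    """
    if step < warmup_steps:
        # Linear warmup: assumed to ramp from peak_lr / warmup_steps up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the post-warmup phase completed, clamped to [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(1.0, progress)
    # Cosine decay from peak_lr (progress = 0) to final_lr (progress = 1).
    return final_lr + 0.5 * (peak_lr - final_lr) * (1.0 + math.cos(math.pi * progress))
```

In a PyTorch training loop, one plausible way to use such a function would be to create `torch.optim.AdamW` with `betas=(0.9, 0.95)` and set each parameter group's `lr` to `learning_rate(step, total_steps)` before every optimizer step; whether the original implementation does this is not stated in the quoted text.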