Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Mixture-of-Experts Operator Transformer for Large-Scale PDE Pre-Training

Authors: Hong Wang, Haiyang Xin, Jie Wang, Xuanze Yang, Fei Zha, Huanshuo Dong, Yan Jiang

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conducted comprehensive experiments to evaluate the performance of MoE-POT. This section is organized as follows: 1. Comparison with various small and pre-trained models on 6 PDE datasets. 2. Testing knowledge transfer capabilities on downstream tasks. 3. Investigating scaling laws to understand performance trends. 4. Interpretable analysis of the router-gating network selection. 5. Analyzing model inference time. 6. Ablation studies to assess the impact of hyperparameters.
Researcher Affiliation	Academia	Hong Wang1,2,3, Haiyang Xin1, Jie Wang1,2,3, Xuanze Yang1, Fei Zha1, Huanshuo Dong1,2,3, Yan Jiang1. 1 University of Science and Technology of China; 2 CAS Key Laboratory of Technology in GIPAS, University of Science and Technology of China; 3 MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China. EMAIL, EMAIL, EMAIL
Pseudocode	No	The paper describes methods using mathematical formulations and textual explanations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures.
Open Source Code	Yes	Our code is available at https://github.com/haiyangxin/MoEPOT.
Open Datasets	Yes	For pre-training, we utilize 6 datasets sourced from 3 benchmark collections: FNO [28], PDEBench [53], and CFDBench [38].
Dataset Splits	Yes	The train and test dataset sizes used in the pre-training and fine-tuning stages are shown in Table 6. And the train and test dataset sizes for downstream tasks are shown in Table 7.
Hardware Specification	Yes	Training was conducted on servers equipped with 8 RTX 4090 GPUs, each with 24 GB of memory.
Software Dependencies	No	The paper mentions using the Adam optimizer and a One-cycle learning rate schedule, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions).
Experiment Setup	Yes	Across all model sizes, we employed the Adam optimizer with a learning rate of 1 × 10⁻³ and trained the models for 1000 epochs. ... For the pretraining stage, we set the learning rate to 1 × 10⁻³ and used a One-cycle learning rate schedule over 1000 epochs, with the first 200 epochs as the warm-up phase. The Adam optimizer was employed with a weight decay of 1 × 10⁻⁶ and momentum parameters (β1, β2) = (0.9, 0.9). ... The patch size was set to 8.
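
The pre-training schedule quoted in the Experiment Setup row can be sketched in plain Python. This is only an illustrative sketch, not the authors' implementation: the quoted text gives the learning rate, epoch count, warm-up length, weight decay and momentum parameters, but not the shape of the schedule. The cosine interpolation, the starting and final divisors (START_DIV, FINAL_DIV), the reading of 1 × 10⁻³ as the peak rate, and all names below are assumptions made for illustration.

```python
import math

# Values quoted in the Experiment Setup row.
PEAK_LR = 1e-3          # assumed to be the peak of the one-cycle schedule
TOTAL_EPOCHS = 1000
WARMUP_EPOCHS = 200
WEIGHT_DECAY = 1e-6
BETAS = (0.9, 0.9)
PATCH_SIZE = 8

# Assumptions, not stated in the source: how far below the peak the
# schedule starts and ends.
START_DIV = 25.0
FINAL_DIV = 1e4


def _cos_interp(start, end, frac):
    """Cosine interpolation from start (frac=0) to end (frac=1)."""
    return end + (start - end) * (1.0 + math.cos(math.pi * frac)) / 2.0


def one_cycle_lr(epoch):
    """Learning rate at a 0-based epoch under the assumed one-cycle shape."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError("epoch out of range")
    start_lr = PEAK_LR / START_DIV
    final_lr = start_lr / FINAL_DIV
    if epoch <= WARMUP_EPOCHS:
        # Warm-up phase: rise from start_lr to the peak over 200 epochs.
        return _cos_interp(start_lr, PEAK_LR, epoch / WARMUP_EPOCHS)
    # Annealing phase: decay from the peak to final_lr by the last epoch.
    frac = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - 1 - WARMUP_EPOCHS)
    return _cos_interp(PEAK_LR, final_lr, frac)


# Hypothetical container for the quoted optimizer settings.
ADAM_CONFIG = {"lr": PEAK_LR, "betas": BETAS, "weight_decay": WEIGHT_DECAY}
```

Calling one_cycle_lr(epoch) for each epoch in range(TOTAL_EPOCHS) gives one rate per epoch, peaking at epoch 200. A reproduction would more likely rely on the deep-learning framework's built-in scheduler; the page lists no software dependencies or versions, so none is assumed here.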