Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Mixture-of-Experts Operator Transformer for Large-Scale PDE Pre-Training

Authors: Hong Wang, Haiyang Xin, Jie Wang, Xuanze Yang, Fei Zha, Huanshuo Dong, Yan Jiang

NeurIPS 2025 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We conducted comprehensive experiments to evaluate the performance of MoE-POT. This section is organized as follows: 1. Comparison with various small and pre-trained models on 6 PDE datasets. 2. Testing knowledge transfer capabilities on downstream tasks. 3. Investigating scaling laws to understand performance trends. 4. Interpretable analysis of the router-gating network selection. 5. Analyzing model inference time. 6. Ablation studies to assess the impact of hyperparameters. |
| Researcher Affiliation | Academia | Hong Wang1,2,3, Haiyang Xin1, Jie Wang1,2,3, Xuanze Yang1, Fei Zha1, Huanshuo Dong1,2,3, Yan Jiang1. 1 University of Science and Technology of China; 2 CAS Key Laboratory of Technology in GIPAS, University of Science and Technology of China; 3 MoE Key Laboratory of Brain-inspired Intelligent Perception and Cognition, University of Science and Technology of China. EMAIL, EMAIL, EMAIL |
| Pseudocode | No | The paper describes methods using mathematical formulations and textual explanations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks or figures. |
| Open Source Code | Yes | Our code is available at https://github.com/haiyangxin/MoEPOT. |
| Open Datasets | Yes | For pre-training, we utilize 6 datasets sourced from 3 benchmark collections: FNO [28], PDEBench [53], and CFDBench [38]. |
| Dataset Splits | Yes | The train and test dataset sizes used in the pre-training and fine-tuning stages are shown in Table 6. And the train and test dataset sizes for downstream tasks are shown in Table 7. |
| Hardware Specification | Yes | Training was conducted on servers equipped with 8 RTX 4090 GPUs, each with 24 GB of memory. |
| Software Dependencies | No | The paper mentions using the Adam optimizer and a One-cycle learning rate schedule, but does not specify software dependencies with version numbers (e.g., Python, PyTorch, CUDA versions). |
| Experiment Setup | Yes | Across all model sizes, we employed the Adam optimizer with a learning rate of $1 \times 10^{-3}$ and trained the models for 1000 epochs. ... For the pretraining stage, we set the learning rate to $1 \times 10^{-3}$ and used a One-cycle learning rate schedule over 1000 epochs, with the first 200 epochs as the warm-up phase. The Adam optimizer was employed with a weight decay of $1 \times 10^{-6}$ and momentum parameters $(\beta_1, \beta_2) = (0.9, 0.9)$. ... The patch size was set to 8. |
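
The pretraining schedule quoted under Experiment Setup can be illustrated with a minimal, framework-free sketch. The peak learning rate, epoch counts, weight decay, momentum parameters, and patch size come from the quoted text. The cosine interpolation, the two divisor constants, and every name below (`one_cycle_lr`, `INITIAL_DIV`, `FINAL_DIV`, and so on) are assumptions made for illustration; the paper excerpt does not specify the exact shape of the One-cycle curve or the software used, so this should not be read as the authors' implementation.

```python
import math

# Values quoted in the Experiment Setup row.
PEAK_LR = 1e-3          # learning rate
TOTAL_EPOCHS = 1000     # length of the One-cycle schedule
WARMUP_EPOCHS = 200     # warm-up phase
WEIGHT_DECAY = 1e-6     # Adam weight decay
BETAS = (0.9, 0.9)      # Adam momentum parameters (beta1, beta2)
PATCH_SIZE = 8

# Assumptions, NOT stated in the source: the curve is cosine-shaped, starts at
# PEAK_LR / INITIAL_DIV, and ends at (PEAK_LR / INITIAL_DIV) / FINAL_DIV.
INITIAL_DIV = 25.0
FINAL_DIV = 1e4


def _cosine_interp(start, end, fraction):
    """Move from start to end along a half cosine as fraction goes 0 -> 1."""
    return end + (start - end) * (1.0 + math.cos(math.pi * fraction)) / 2.0


def one_cycle_lr(epoch):
    """Learning rate for a 0-indexed epoch under the assumed one-cycle shape."""
    if not 0 <= epoch < TOTAL_EPOCHS:
        raise ValueError("epoch out of range")
    initial_lr = PEAK_LR / INITIAL_DIV
    final_lr = initial_lr / FINAL_DIV
    if epoch <= WARMUP_EPOCHS:
        # Warm-up: rise from the initial rate to the peak.
        return _cosine_interp(initial_lr, PEAK_LR, epoch / WARMUP_EPOCHS)
    # Annealing: fall from the peak to the final rate by the last epoch.
    fraction = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - 1 - WARMUP_EPOCHS)
    return _cosine_interp(PEAK_LR, final_lr, fraction)


if __name__ == "__main__":
    for epoch in (0, 100, 200, 600, 999):
        print(epoch, one_cycle_lr(epoch))
```

Under these assumptions the rate rises over the first 200 epochs, reaches the quoted peak at epoch 200, and then decays for the remaining epochs. Anyone reproducing the paper should take the actual schedule from the released code rather than from this sketch.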