Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers
Authors: Hang Zhou, Yuezhou Ma, Haixu Wu, Haowen Wang, Mingsheng Long
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on our own generated dataset and two large-scale benchmarks with various PDE components, where Unisolver achieves consistent state-of-the-art with sharp relative gains. |
| Researcher Affiliation | Academia | School of Software, BNRist, Tsinghua University, China. Hang Zhou <EMAIL>. Correspondence to: Haixu Wu <EMAIL>, Mingsheng Long <EMAIL>. |
| Pseudocode | No | The paper includes Equation (2) which formalizes the n-th layer of Unisolver, but it is not explicitly labeled as "Pseudocode" or "Algorithm". There are no other clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/thuml/Unisolver. |
| Open Datasets | Yes | We conduct extensive experiments on our own generated dataset and two large-scale benchmarks with various PDE components... The Heter NS is an extension of the NS dataset from FNO (2021a)... The 1D time-dependent PDEs, introduced by PDEformer (2024)... The 2D mixed PDEs, collected by DPOT (2024)... The dataset can be accessed at the following anonymous link: https://drive.google.com/drive/folders/1te5IyQHTznu_Kw7v3zDHg0i_KCHysPKw?usp=share_link |
| Dataset Splits | Yes | For each combination, we generate 1000 samples, yielding a total of 15,000 training samples. The remaining 200 instances are used for testing its performance. In in-distribution tests, the initial conditions vary across samples. Zero-shot generalization settings present much greater challenges... We assess the model's zero-shot performance on 200 samples. |
| Hardware Specification | Yes | Table 2. Summary of benchmarks. #GPU hours are calculated by averaging the training time of all models on one A100 GPU... Our models were trained on servers with 32 NVIDIA A100 GPUs, each with 40GB memory. |
| Software Dependencies | No | The paper mentions using the ADAM optimizer (Kingma & Ba, 2015) and a cosine annealing learning rate scheduler (Loshchilov & Hutter, 2016), and the LLaMA-3 8B model. However, it does not provide specific version numbers for these or for the main deep learning framework (e.g., PyTorch, TensorFlow) or programming language used. |
| Experiment Setup | Yes | All methods in the Heter NS benchmark are trained for 300 epochs using relative L2 loss and the ADAM optimizer (Kingma & Ba, 2015) with an initial learning rate of 0.0005 and a cosine annealing learning rate scheduler (Loshchilov & Hutter, 2016). The batch size is set to 60. For the 1D time-dependent PDEs and 2D mixed PDEs, we follow the training strategies from the original papers of PDEformer (2024) and DPOT (2024) to ensure a fair comparison. Relative L2 is used as the evaluation metric. See Appendix H for full implementation details and hyper-parameter configurations of each model. |
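The experiment-setup row above reports that the models are trained and evaluated with the relative L2 metric. As a reference, here is a minimal NumPy sketch of that metric under its standard definition (the paper's own implementation may differ in details such as batching or normalization):

```python
import numpy as np

def relative_l2(pred, target, eps=1e-8):
    """Mean per-sample relative L2 error: ||pred - target||_2 / ||target||_2.

    pred, target: arrays of shape (batch, ...); trailing dims are flattened.
    eps guards against division by zero for all-zero targets (an assumption,
    not taken from the paper).
    """
    pred = np.asarray(pred, dtype=float).reshape(len(pred), -1)
    target = np.asarray(target, dtype=float).reshape(len(target), -1)
    num = np.linalg.norm(pred - target, axis=1)
    den = np.linalg.norm(target, axis=1) + eps
    return float(np.mean(num / den))
```

A perfect prediction gives 0, and predicting all zeros against a nonzero target gives approximately 1, which is why relative L2 is a common scale-free error measure for neural PDE solvers.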