Poseidon: Efficient Foundation Models for PDEs
Authors: Maximilian Herde, Bogdan Raonić, Tobias Rohner, Roger Käppeli, Roberto Molinaro, Emmanuel de Bézenac, Siddhartha Mishra
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | POSEIDON is pretrained on a diverse, large-scale dataset for the governing equations of fluid dynamics. It is then evaluated on a suite of 15 challenging downstream tasks that include a wide variety of PDE types and operators. We show that POSEIDON exhibits excellent performance across the board by outperforming baselines significantly, both in terms of sample efficiency and accuracy. |
| Researcher Affiliation | Academia | Maximilian Herde (1), Bogdan Raonić (1,2), Tobias Rohner (1), Roger Käppeli (1), Roberto Molinaro (1), Emmanuel de Bézenac (1), Siddhartha Mishra (1,2); (1) Seminar for Applied Mathematics, ETH Zurich, Switzerland; (2) ETH AI Center, Zurich, Switzerland |
| Pseudocode | No | The paper describes the model architecture and computational realizations through prose and mathematical equations but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Finally, the POSEIDON model as well as underlying pretraining and downstream datasets are open sourced, with code being available at https://github.com/camlab-ethz/poseidon and pretrained models and datasets at https://huggingface.co/camlab-ethz. (A hedged download sketch follows the table.) |
| Open Datasets | Yes | All these datasets are publicly available with the PDEGYM collection (https://huggingface.co/collections/camlab-ethz/pdegym-665472c2b1181f7d10b40651). |
| Dataset Splits | Yes | We generated 20000 NS-Sines trajectories of which the first 19640 belong to the training set, the next 120 to the validation set, and the last 240 to the test set. (See the index-split sketch after the table.) |
| Hardware Specification | Yes | The model is pretrained on 8 NVIDIA RTX 4090 GPUs using the following (data-parallel) training protocol: ... All our pretrainings were performed in (data-)parallel on 8 NVIDIA GeForce RTX 4090 GPUs. |
| Software Dependencies | No | The paper mentions using 'AdamW [41]' as an optimizer and states 'Everything is tightly integrated into Huggingface Transformers [73] and we make heavy use of Huggingface Accelerate for distributed training.' However, specific version numbers for these software components (e.g., PyTorch, Huggingface Transformers, or Accelerate), which are crucial for full reproducibility, are not provided. |
| Experiment Setup | Yes | Optimizer: AdamW [41]; Scheduler: cosine decay with linear warmup of 2 epochs; Maximum learning rate: 10⁻³; Weight decay: 0.1; Effective batch size: 640 (per-device batch size of 80); Number of epochs: 40; Early stopping: no; Gradient clipping (maximal norm): 5. (A hedged PyTorch sketch of this protocol follows the table.) |
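The pretrained models and the PDEGYM datasets referenced above are hosted under the camlab-ethz organization on Hugging Face. The sketch below is a minimal, hedged example of fetching them with `huggingface_hub`; the repository ids are placeholders, and the exact names should be read off the model page and the PDEGYM collection linked in the table.

```python
from huggingface_hub import snapshot_download

# Placeholder repository ids: substitute the exact names listed under
# https://huggingface.co/camlab-ethz and the PDEGYM collection page.
MODEL_REPO = "camlab-ethz/<poseidon-model-id>"    # hypothetical id
DATASET_REPO = "camlab-ethz/<pdegym-dataset-id>"  # hypothetical id

# Download the pretrained model weights into the local Hugging Face cache.
model_dir = snapshot_download(repo_id=MODEL_REPO)

# Datasets live in dataset-type repositories, so repo_type must be set.
data_dir = snapshot_download(repo_id=DATASET_REPO, repo_type="dataset")

print("model files:", model_dir)
print("dataset files:", data_dir)
```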
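As a concrete reading of the NS-Sines split quoted above (assuming trajectories are indexed in generation order), the 20000 trajectories partition by index as follows:

```python
# Index-based split of the 20000 NS-Sines trajectories described above:
# first 19640 for training, next 120 for validation, last 240 for testing.
n_total, n_train, n_val, n_test = 20_000, 19_640, 120, 240
assert n_train + n_val + n_test == n_total

train_idx = range(0, n_train)               # trajectories 0 .. 19639
val_idx = range(n_train, n_train + n_val)   # trajectories 19640 .. 19759
test_idx = range(n_train + n_val, n_total)  # trajectories 19760 .. 19999
```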
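Finally, the pretraining protocol in the last row maps onto a standard PyTorch training loop. The sketch below is an assumption-laden reconstruction rather than the authors' code: the model, data, and loss are stand-ins, and only the optimizer, warmup/cosine schedule, batch size, epoch count, and gradient clipping follow the reported values (the effective batch size of 640 corresponds to a per-device batch of 80 across 8 data-parallel GPUs).

```python
import math
import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR
from torch.utils.data import DataLoader, TensorDataset

# Hyperparameters reported in the table above.
MAX_LR, WEIGHT_DECAY, NUM_EPOCHS, WARMUP_EPOCHS, MAX_GRAD_NORM = 1e-3, 0.1, 40, 2, 5.0
PER_DEVICE_BATCH = 80  # effective batch of 640 across 8 data-parallel GPUs

# Stand-in model and data; the real setup trains the Poseidon operator on PDEGYM.
model = nn.Linear(16, 16)
loader = DataLoader(TensorDataset(torch.randn(800, 16), torch.randn(800, 16)),
                    batch_size=PER_DEVICE_BATCH)
loss_fn = nn.MSELoss()

optimizer = AdamW(model.parameters(), lr=MAX_LR, weight_decay=WEIGHT_DECAY)

steps_per_epoch = len(loader)
warmup_steps = WARMUP_EPOCHS * steps_per_epoch
total_steps = NUM_EPOCHS * steps_per_epoch

def lr_lambda(step: int) -> float:
    # Linear warmup over the first 2 epochs, then cosine decay to zero.
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))

scheduler = LambdaLR(optimizer, lr_lambda)

for epoch in range(NUM_EPOCHS):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        # Gradient clipping with maximal norm 5, as reported above.
        torch.nn.utils.clip_grad_norm_(model.parameters(), MAX_GRAD_NORM)
        optimizer.step()
        scheduler.step()
```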