Multiple Physics Pretraining for Spatiotemporal Surrogate Models
Authors: Michael McCabe, Bruno Régaldo-Saint Blancard, Liam Parker, Ruben Ohana, Miles Cranmer, Alberto Bietti, Michael Eickenberg, Siavash Golkar, Géraud Krawezik, François Lanusse, Mariel Pettee, Tiberiu Tesileanu, Kyunghyun Cho, Shirley Ho
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the efficacy of our approach on both pretraining and downstream tasks over a broad fluid mechanics-oriented benchmark. We show that a single MPP-pretrained transformer is able to match or outperform task-specific baselines on all pretraining sub-tasks without the need for finetuning. |
| Researcher Affiliation | Collaboration | 1 Flatiron Institute, 2 University of Colorado Boulder, 3 University of Cambridge, 4 Université Paris-Saclay, Université Paris Cité, CEA, CNRS, AIM, 5 Physics Division, Lawrence Berkeley National Laboratory, 6 New York University, 7 Prescient Design, Genentech, 8 CIFAR Fellow, 9 Princeton University |
| Pseudocode | No | The paper includes architectural diagrams and descriptions of its methods but does not contain a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | Yes | We open-source our code and model weights trained at multiple scales for reproducibility. |
| Open Datasets | Yes | Data. We use the full collection of two-dimensional time-dependent simulations from PDEBench (Takamoto et al., 2022) as our primary source for diverse pretraining data. ... To train and evaluate our models, we use the publicly available PDEBench dataset (Takamoto et al., 2022). We summarize the data included in this section. (See the file-inspection sketch after the table.) |
| Dataset Splits | Yes | Train/Val/Test: 0.8/0.1/0.1 split per dataset at the trajectory level (see the trajectory-split sketch after the table). |
| Hardware Specification | Yes | Hardware. All training for both pretraining and finetuning is done using Distributed Data Parallel (DDP) across 8 Nvidia H100-80GB GPUs (see the DDP skeleton after the table). |
| Software Dependencies | Yes | Software. All model development and training in this paper is performed using PyTorch 2.0 (Paszke et al., 2019). |
| Experiment Setup | Yes | For MPP, we train using the following settings: Training Duration: 200K steps; Micro-batch size: 8; Accumulation Steps: 5; Optimizer: Adan (Xie et al., 2023); Weight Decay: 1E-3; Drop Path: 0.1; Base LR: DAdaptation (Defazio & Mishchenko, 2023); LR Schedule: Cosine decay; Gradient clipping: 1.0 (see the training-step sketch after the table). |
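The sketches below elaborate on the dataset, split, hardware, and experiment-setup rows above. First, the PDEBench data ships as HDF5 files; a minimal inspection example with `h5py` is shown here. The file name and the commented-out dataset key are hypothetical, since each PDEBench benchmark uses its own internal layout, so the keys should be checked against the PDEBench repository's data loaders rather than taken from this sketch.

```python
import h5py

# Hypothetical file name; substitute the actual PDEBench download.
PATH = "2D_CFD_example.hdf5"

with h5py.File(PATH, "r") as f:
    # Inspect the group/dataset names before assuming a layout; the 2D
    # benchmarks differ in how fields and trajectories are organized.
    f.visit(print)
    # Example read once the key is known (the key name is an assumption):
    # first_trajectory = f["Vx"][0]   # shape ~ (time, x, y)
```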
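Next, a minimal sketch of the 0.8/0.1/0.1 split performed at the trajectory level. The function and its arguments are illustrative assumptions rather than the authors' released code; the point is that whole trajectories, not individual frames, are assigned to a partition.

```python
import numpy as np

def split_trajectories(n_trajectories: int, seed: int = 0):
    """Split trajectory indices into train/val/test at a 0.8/0.1/0.1 ratio.

    Splitting at the trajectory level keeps every frame of a given
    simulation in the same partition, avoiding leakage between the
    training and evaluation sets.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_trajectories)
    n_train = int(0.8 * n_trajectories)
    n_val = int(0.1 * n_trajectories)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    return train, val, test

# Example: 1,000 simulated trajectories in one dataset.
train_idx, val_idx, test_idx = split_trajectories(1000)
```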
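The hardware row reports Distributed Data Parallel across 8 H100 GPUs. A generic DDP skeleton is sketched below, assuming a `torchrun --nproc_per_node=8 train.py` launch; the linear layer is a placeholder for the paper's transformer, and no claim is made about the authors' actual launch script.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def setup_ddp() -> int:
    # torchrun sets RANK / LOCAL_RANK / WORLD_SIZE environment variables.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)
    return local_rank

def main():
    local_rank = setup_ddp()
    # Placeholder model; the paper trains a transformer surrogate instead.
    model = torch.nn.Linear(128, 128).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    # ... training loop goes here ...
    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```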
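Finally, the training settings in the last row translate into an update step with gradient accumulation (micro-batch 8, 5 accumulation steps), cosine learning-rate decay, weight decay 1e-3, and gradient clipping at 1.0. The sketch below uses `AdamW` as a stand-in for the Adan optimizer with D-Adaptation that the paper specifies, since those come from separate libraries whose APIs are not reproduced here; the model and loss are placeholders.

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

# Settings quoted from the paper's training table.
TOTAL_STEPS = 200_000
MICRO_BATCH = 8
ACCUM_STEPS = 5
WEIGHT_DECAY = 1e-3
CLIP_NORM = 1.0

model = torch.nn.Linear(128, 128)  # placeholder, not the MPP model
# AdamW stands in for Adan + D-Adaptation (learning-rate-free base LR).
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=WEIGHT_DECAY)
scheduler = CosineAnnealingLR(optimizer, T_max=TOTAL_STEPS)

def train_step(micro_batches, loss_fn):
    """One optimizer update accumulated over ACCUM_STEPS micro-batches."""
    optimizer.zero_grad()
    for x, y in micro_batches:  # len(micro_batches) == ACCUM_STEPS
        loss = loss_fn(model(x), y) / ACCUM_STEPS
        loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), CLIP_NORM)
    optimizer.step()
    scheduler.step()
```

If the micro-batch size of 8 is per GPU, these settings imply an effective batch of 8 × 5 × 8 = 320 samples per optimizer step across the 8-GPU DDP run, though the paper's quoted table does not state this explicitly.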