Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Strategies for Pretraining Neural Operators
Authors: Anthony Zhou, Cooper Lorsung, AmirPouya Hemmasian, Amir Barati Farimani
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the effectiveness of the proposed pretraining strategies and data augmentations, we consider a diverse set of experiments and neural operator architectures to train on. We now systematically benchmark our pretraining and data augmentation strategies, as well as their combination. Presented below are results on our autoregressive task. |
| Researcher Affiliation | Academia | Anthony Zhou, Department of Mechanical Engineering, Carnegie Mellon University; Cooper Lorsung, Department of Mechanical Engineering, Carnegie Mellon University; Amir Pouya Hemmasian, Department of Mechanical Engineering, Carnegie Mellon University; Amir Barati Farimani, Department of Mechanical Engineering and Department of Machine Learning, Carnegie Mellon University |
| Pseudocode | No | The paper does not contain any explicit sections or figures labeled 'Pseudocode' or 'Algorithm'. |
| Open Source Code | Yes | We make code available here: https://github.com/anthonyzhou-1/pretraining_pdes |
| Open Datasets | Yes | and use 2D PDE datasets from Zhou & Farimani (2024b) which can be found here: https://zenodo.org/records/13355846. |
| Dataset Splits | Yes | During pretraining, 9216 total samples are generated, with 3072 samples of the 2D Heat, Advection, and Burgers equations respectively. ... We generate 1024 samples for the Heat, Advection, Burgers, and Navier-Stokes equations to train with. An additional 1024 out-of-distribution samples for the Heat, Advection, and Burgers equations are also generated. Validation samples are generated similarly to fine-tuning samples... We generate 256 samples for the Heat, Advection, Burgers, and Navier-Stokes equations. |
| Hardware Specification | Yes | All experiments are run on an NVIDIA GeForce RTX 2080 Ti GPU. |
| Software Dependencies | No | We generate labels for derivative regression through taking spatial and time derivatives {ut, ux, uy, uxx, uyy} of the PDE solution field using FinDiff (Baer, 2018). ... Adam with a learning rate of 1e-3, a weight decay of 1e-6, and a OneCycle scheduler... We use an Adam optimizer with a learning rate of 1e-3, with weight decay of 1e-6, and a Cosine Annealing scheduler. |
| Experiment Setup | Yes | Models are generally trained for 200 epochs with a batch size of 32 using Adam with a learning rate of 1e-3, a weight decay of 1e-6, and a OneCycle scheduler for five seeds. ... We use an Adam optimizer with a learning rate of 1e-3, with weight decay of 1e-6, and a Cosine Annealing scheduler. |
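The quoted setup names standard hyperparameters (learning rate 1e-3, 200 epochs, cosine-annealing schedule). As a minimal illustration of how such a schedule behaves, not the authors' code, the cosine-annealed learning rate can be sketched as a plain function of the epoch (the floor of 0 is an assumption; the paper does not state a minimum rate):

```python
import math

def cosine_annealing_lr(epoch, total_epochs=200, lr_max=1e-3, lr_min=0.0):
    """Cosine-annealed learning rate over a fixed number of epochs.

    lr_max=1e-3 and total_epochs=200 follow the quoted experiment setup;
    lr_min=0.0 is an assumption (the paper does not specify a floor).
    """
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1 + math.cos(math.pi * epoch / total_epochs)
    )

# The schedule starts at the base rate and decays smoothly toward lr_min.
print(cosine_annealing_lr(0))    # 0.001 at the start
print(cosine_annealing_lr(200))  # 0.0 at the end
```

In practice this corresponds to PyTorch's `torch.optim.lr_scheduler.CosineAnnealingLR`, attached to an `Adam` optimizer constructed with `lr=1e-3` and `weight_decay=1e-6` as in the quoted setup.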