Data-Efficient Operator Learning via Unsupervised Pretraining and In-Context Learning
Authors: Wuyang Chen, Jialin Song, Pu Ren, Shashank Subramanian, Dmitriy Morozov, Michael W. Mahoney
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive empirical evaluations on a diverse set of PDEs demonstrate that our method is highly data-efficient, more generalizable, and even outperforms conventional vision-pretrained models. |
| Researcher Affiliation | Collaboration | Wuyang Chen (Simon Fraser University); Jialin Song (Simon Fraser University); Pu Ren (Lawrence Berkeley National Laboratory); Shashank Subramanian (Lawrence Berkeley National Laboratory); Dmitriy Morozov (Lawrence Berkeley National Laboratory); Michael W. Mahoney (International Computer Science Institute, Lawrence Berkeley National Laboratory, University of California, Berkeley) |
| Pseudocode | Yes | Algorithm 1: Pseudocode of Similarity-based Mining of In-Context Examples. (A hedged sketch of such a mining step appears below the table.) |
| Open Source Code | Yes | We provide our code at https://github.com/delta-lab-ai/data_efficient_nopt. |
| Open Datasets | Yes | ECMWF Reanalysis v5 (ERA5) [30] is an extensive public dataset, ... Scalar Flow [14] is a reconstruction of real-world smoke plumes. ... Airfoil [75] is a large-scale dataset ... We generate data for Poisson and Helmholtz [73], Reaction-Diffusion on PDEBench [74], and 2D incompressible Navier-Stokes on the PINO dataset [46], following the procedure described in the paper. |
| Dataset Splits | Yes | We split 75% of the data for pretraining and 25% for fine-tuning. For each split, we further separate 80% of the data for training, 10% for validation, and 10% for testing. (The split arithmetic is sketched below the table.) |
| Hardware Specification | Yes | We conducted our experiments on four A100 GPUs, each with 40GB of memory. |
| Software Dependencies | No | We train Video MAE with Adam, with other hyperparameters the same as in Table 3 column N.S. (PDEBench). ... DAdapt: adaptive learning rate by D-Adaptation [10]. The paper does not provide specific version numbers for software dependencies. |
| Experiment Setup | Yes | We summarize our hyperparameters used during pretraining and fine-tuning/training in Table 3. ... Table 3: Hyperparameters for pretraining and training/fine-tuning. N.S.: 2D Incompressible Navier-Stokes. DAdapt: adaptive learning rate by D-Adaptation [10]. ns: total number of simulated training samples. The batch size is min(32, ns) because the total number of training samples might be fewer than 32. |
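
The Pseudocode row refers to the paper's Algorithm 1, similarity-based mining of in-context examples. As a rough illustration only, here is a minimal Python sketch of such a mining step: it assumes the query and candidate demos are discretized fields and uses a plain L2 distance as the similarity score, which may differ from the metric in the paper's Algorithm 1. The function and variable names are hypothetical.

```python
import numpy as np

def mine_in_context_examples(query_input, demo_inputs, demo_outputs, k=3):
    """Select the k demos whose inputs are most similar to the query.

    query_input : (H, W) array, discretized input function of the query PDE.
    demo_inputs : (N, H, W) array of candidate demo inputs.
    demo_outputs: (N, H, W) array of the corresponding solutions.
    Returns the k most similar (input, output) pairs.
    """
    # Flatten the fields and score similarity by L2 distance
    # (an assumption; the paper may use a different metric).
    q = query_input.reshape(-1)
    d = demo_inputs.reshape(len(demo_inputs), -1)
    dists = np.linalg.norm(d - q, axis=1)
    top_k = np.argsort(dists)[:k]
    return demo_inputs[top_k], demo_outputs[top_k]
```

At inference time, the selected pairs would be supplied alongside the query as the model's in-context examples; consult the paper's Algorithm 1 and the released code for the actual procedure.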
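
The dataset-split and batch-size rules quoted in the Dataset Splits and Experiment Setup rows can be written compactly. The sketch below only reproduces the stated proportions (75/25 pretrain/finetune, then 80/10/10 train/val/test) and the min(32, ns) batch-size cap; the shuffling, rounding, and seeding details are assumptions, not taken from the released code.

```python
import numpy as np

def split_dataset(n_samples, seed=0):
    """Split sample indices 75/25 into pretrain/finetune pools,
    then split each pool 80/10/10 into train/val/test."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_pre = int(0.75 * n_samples)
    pools = {"pretrain": idx[:n_pre], "finetune": idx[n_pre:]}
    splits = {}
    for name, pool in pools.items():
        n = len(pool)
        n_tr, n_va = int(0.8 * n), int(0.1 * n)
        splits[name] = {
            "train": pool[:n_tr],
            "val": pool[n_tr:n_tr + n_va],
            "test": pool[n_tr + n_va:],
        }
    return splits

def batch_size(ns):
    # Table 3 rule: cap the batch size at 32 because the number of
    # simulated training samples ns might be fewer than 32.
    return min(32, ns)
```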