Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Size Lowerbounds for Deep Operator Networks

Authors: Anirbit Mukherjee, Amartya Roy

TMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This inspires our experiments with DeepONets solving the advection-diffusion-reaction PDE, where we demonstrate the possibility that, at a fixed model size, to leverage an increase in this common output dimension and get a monotonic lowering of training error, the size of the training data might necessarily need to scale at least quadratically with it. In Section 7 we give an experimental demonstration revealing a property of DeepONets about how much training data is required to leverage any increase in the common output dimension of the branch and the trunk. We created 10 DeepONet models in each experimental setting such that each model has a depth of 5 and a width varying between 24 and 50 for each layer, while keeping the total number of training parameters approximately equal for each of those 10 models. For each case the branch input dimension is 40 (i.e. the number of sensor points), and the trunk input dimension is 2. The smallest number of training data samples (n) we use is 10^4, and twice we make a choice of 10 different (q, n) values parameterizing the learning setups, once keeping the ratio q/n approximately constant and then holding the ratio q/n^(2/3) almost fixed. All the DeepONet models were trained by the stochastic Adam optimizer at its default parameters.
Researcher Affiliation | Collaboration | Anirbit Mukherjee (EMAIL), Department of Computer Science, The University of Manchester; Amartya Roy (EMAIL), Robert Bosch GmbH, Coimbatore, India
Pseudocode | No | The paper describes mathematical proofs and experimental setup in prose, without presenting any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code for this experiment can be found in our GitHub repository (link).
Open Datasets | No | For sampling f we have considered the Gaussian random field (GRF) distribution. Here we have used the mean-zero GRF, f ~ G(0, k_l(x1, x2)), where the covariance kernel k_l(x1, x2) = exp(−|x1 − x2|^2 / (2l^2)) is the radial-basis-function (RBF) kernel with a length-scale parameter l > 0. For our experiments we have taken l = 10^−3. After sampling f from the chosen function spaces, we solve the PDE by a second-order finite difference method to obtain the reference solutions.
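The GRF sampling described in this quote can be sketched as drawing from a multivariate normal whose covariance is the RBF kernel evaluated at the sensor points. This is a minimal illustration, not the authors' code; the sensor grid on [0, 1], the jitter term, and the seed are assumptions.

```python
import numpy as np

def sample_grf(sensor_xs, length_scale, n_samples, jitter=1e-10, seed=0):
    """Draw mean-zero GRF samples at the sensor points with RBF covariance."""
    rng = np.random.default_rng(seed)
    # RBF kernel: k_l(x1, x2) = exp(-|x1 - x2|^2 / (2 l^2))
    diff = sensor_xs[:, None] - sensor_xs[None, :]
    K = np.exp(-diff**2 / (2.0 * length_scale**2))
    # A small jitter on the diagonal keeps the Cholesky factorization stable.
    L = np.linalg.cholesky(K + jitter * np.eye(len(sensor_xs)))
    return rng.standard_normal((n_samples, len(sensor_xs))) @ L.T

# 40 sensor points, matching the branch input dimension quoted in the paper.
xs = np.linspace(0.0, 1.0, 40)
fs = sample_grf(xs, length_scale=1e-3, n_samples=10)
print(fs.shape)  # (10, 40)
```

With a length scale this small relative to the sensor spacing, the covariance matrix is nearly the identity and the samples are close to white noise at the sensors.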
Dataset Splits | No | For n training data samples, the ℓ2 empirical loss being minimized is L̂_DeepONet = (1/n) Σ_{i=1}^{n} (y_i − G_θ(f_i)(p_i))^2, where p_i is a randomly sampled point in the (x, t) space and y_i is the approximate PDE solution at p_i corresponding to f_i, which we recall was obtained from a conventional solver. The paper discusses training loss but does not explicitly describe how the data is split into training, validation, or test sets.
Hardware Specification | No | The paper does not provide specific details about the hardware used to run the experiments.
Software Dependencies | No | All the DeepONet models were trained by the stochastic Adam optimizer at its default parameters. This mentions an optimizer but lacks specific software library names and version numbers.
Experiment Setup | Yes | We created 10 DeepONet models in each experimental setting such that each model has a depth of 5 and a width varying between 24 and 50 for each layer, while keeping the total number of training parameters approximately equal for each of those 10 models. For each case the branch input dimension is 40 (i.e. the number of sensor points), and the trunk input dimension is 2. The smallest number of training data samples (n) we use is 10^4, and twice we make a choice of 10 different (q, n) values parameterizing the learning setups, once keeping the ratio q/n approximately constant and then holding the ratio q/n^(2/3) almost fixed. All the DeepONet models were trained by the stochastic Adam optimizer at its default parameters.
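The DeepONet architecture this setup refers to pairs a branch net (fed the 40 sensor values of f) with a trunk net (fed a query point (x, t)), both ending in a common q-dimensional output whose inner product gives the prediction. The following is a bare numpy sketch of that forward pass only, not the authors' implementation; the widths, initialization, and q = 32 are illustrative assumptions.

```python
import numpy as np

def mlp_params(sizes, rng):
    # Random He-style initialization for a small fully connected net.
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.tanh(x)
    return x

rng = np.random.default_rng(0)
q = 32                                             # common output dimension
branch = mlp_params([40, 32, 32, 32, 32, q], rng)  # depth 5, 40 sensor inputs
trunk = mlp_params([2, 32, 32, 32, 32, q], rng)    # depth 5, (x, t) inputs

f_sensors = rng.standard_normal((8, 40))  # a batch of sampled input functions
points = rng.standard_normal((8, 2))      # one query point (x, t) per function
# DeepONet prediction: inner product of branch and trunk features.
pred = np.sum(mlp(branch, f_sensors) * mlp(trunk, points), axis=1)
print(pred.shape)  # (8,)
```

Under this structure, increasing q enlarges only the final layer of each net, which is why the paper can vary q at roughly fixed total parameter count.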