Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Towards Size-Independent Generalization Bounds for Deep Operator Nets

Authors: Pulkit Gopalani, Sayar Karmakar, Dibyakanti Kumar, Anirbit Mukherjee

TMLR 2024

Reproducibility variables, results, and the supporting LLM responses:
Research Type: Experimental
  "In Section 4.3, we undertake an empirical study on a particular component of the generalization bound, represented as (C_{n,n−1}/√m)·(∏_{j=2}^{n−1} C_{j,j−1}), in relation to the generalization gap, to assess the correlation between these factors. Further, our experiments will demonstrate that the complexity measures of DeepONets as found by our Rademacher analysis indeed correlate with the true generalization gap over varying sizes of the training data."
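The correlation study described in this excerpt can be sketched in a few lines; this is not the authors' code, and the complexity and gap arrays below are illustrative placeholder values only, standing in for per-training-size measurements.

```python
import numpy as np

# Hypothetical sketch (not the authors' code): given per-training-size
# measurements of the complexity term and the observed generalization gap,
# the reported correlation can be quantified with a Pearson coefficient.
complexity = np.array([1.2, 1.5, 1.9, 2.4, 3.0])  # placeholder measurements
gap = np.array([0.10, 0.13, 0.17, 0.22, 0.27])    # placeholder gap values
r = np.corrcoef(complexity, gap)[0, 1]            # Pearson correlation
```

A value of `r` near 1 would indicate that the norm-based complexity measure tracks the true generalization gap, which is the relationship the paper's Section 4.3 experiments examine.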
Researcher Affiliation: Academia
  Pulkit Gopalani, Department of Computer Science & Engineering, University of Michigan; Sayar Karmakar, Department of Statistics, University of Florida; Dibyakanti Kumar, Department of Computer Science, The University of Manchester; Anirbit Mukherjee, Department of Computer Science, The University of Manchester
Pseudocode: No
  The paper does not contain structured pseudocode or algorithm blocks; it describes methodologies and proofs in narrative text and mathematical notation.
Open Source Code: Yes
  "Here is the link to the GitHub repository containing our code for training a DeepONet for the Heat and Burgers P.D.E.s."
Open Datasets: No
  "Further, for our experiments, we sampled the input functions u_i by sampling functions of the form u_i(x) = Σ_{n=1}^{N} c_n sin((n+1)x), where the coefficients were sampled as c_n ~ N(0, A²), with A and N being arbitrarily chosen constants. This way of sampling ensures that the functions chosen are all 2π-periodic and will have zero mean in the interval [−π, π], as required in our setup."
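As a minimal NumPy sketch (not the authors' code), the sampling scheme quoted above can be reproduced as follows; the default values of N and A are arbitrary, as the excerpt itself notes:

```python
import numpy as np

def sample_input_function(N=5, A=1.0, rng=None):
    """Sample u(x) = sum_{n=1}^{N} c_n * sin((n+1) x), with c_n ~ N(0, A^2)."""
    rng = np.random.default_rng() if rng is None else rng
    c = rng.normal(0.0, A, size=N)  # c_n ~ N(0, A^2)
    n = np.arange(1, N + 1)
    return lambda x: np.sin(np.outer(np.atleast_1d(x), n + 1)) @ c

# Each sampled function is 2*pi-periodic with zero mean on [-pi, pi],
# since every sin((n+1)x) mode integrates to zero over that interval.
u = sample_input_function(N=5, A=1.0, rng=np.random.default_rng(0))
x = np.linspace(-np.pi, np.pi, 101)   # symmetric evaluation grid
empirical_mean = u(x).mean()          # ~0 by the odd symmetry of the modes
```

The zero-mean property follows directly from the odd symmetry of each sine mode over the symmetric interval, which is why the empirical mean over a symmetric grid vanishes up to floating-point error.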
Dataset Splits: No
  "For our experiments, we fixed N_training to 400, and P_training is chosen to be the following set of values: {300, 400, 500, 750, 1000, 1500, 2000}. Similarly as above, the test loss L_Test would be defined corresponding to a random sampling of N_test input functions and P_test points in the domain of the P.D.E. where the true solution is known, corresponding to each of these test input functions."
Hardware Specification: No
  The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies: No
  The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers, such as Python 3.8 or CPLEX 12.4) needed to replicate the experiment.
Experiment Setup: Yes
  "We use branch and trunk nets, each of which is of depth 3 and width 100. For our experiments, we fixed N_training to 400, and P_training is chosen to be the following set of values: {300, 400, 500, 750, 1000, 1500, 2000}. The correlation plots shown in Figures 2a and 2b correspond to training on the Huber loss function with two different values of δ; the former corresponds to the special value δ = 1/4, where the size-independent generalization bound in Theorem 4.2 clicks for this setup. We have chosen q = 128 and m = 100 for our experiments. We use branch and trunk nets, each of which is of depth 3 and width 128."
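A minimal NumPy sketch of the setup quoted above, assuming plain ReLU MLPs for the branch and trunk nets and the standard DeepONet dot-product readout; everything beyond depth 3, width 100, q = 128, m = 100, and δ = 1/4 (initialization, activation, the placeholder target) is an assumption for illustration, not the authors' implementation:

```python
import numpy as np

def init_mlp(sizes, rng):
    """He-initialized (weight, bias) pairs for an MLP with the given layer sizes."""
    return [(rng.normal(0.0, np.sqrt(2.0 / a), size=(a, b)), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp(params, x):
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU on hidden layers only
    return x

def huber(residual, delta=0.25):
    """Huber loss; delta = 1/4 is the special value from Theorem 4.2."""
    a = np.abs(residual)
    return np.where(a <= delta, 0.5 * residual**2, delta * (a - 0.5 * delta))

rng = np.random.default_rng(0)
m, q, width, depth = 100, 128, 100, 3
# depth-3, width-100 branch and trunk nets mapping into a shared q-dim latent space
branch = init_mlp([m] + [width] * (depth - 1) + [q], rng)
trunk = init_mlp([1] + [width] * (depth - 1) + [q], rng)

u_sensors = rng.normal(size=(1, m))  # one input function sampled at m sensor points
y = np.array([[0.3]])                # one query point in the PDE domain
G = (mlp(branch, u_sensors) * mlp(trunk, y)).sum(axis=1)  # DeepONet output
loss = huber(G - 0.0).mean()         # residual against a placeholder target of 0
```

The dot-product readout between branch and trunk features is the defining structure of a DeepONet; the second quoted configuration (width 128) would be obtained by setting `width = 128`.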