Neural Networks Trained to Solve Differential Equations Learn General Representations

Authors: Martin Magill, Faisal Qureshi, Hendrick de Haan

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate this method by studying generality in neural networks trained to solve parametrized boundary value problems based on the Poisson partial differential equation. We find that the first hidden layers are general, and that they learn generalized coordinates over the input domain. Deeper layers are successively more specific. Next, we validate our method against an existing technique that measures layer generality using transfer learning experiments. We find excellent agreement between the two methods, and note that our method is much faster, particularly for continuously-parametrized problems.
Researcher Affiliation | Academia | Martin Magill, U. of Ontario Inst. of Tech., martin.magill1@uoit.net; Faisal Z. Qureshi, U. of Ontario Inst. of Tech., faisal.qureshi@uoit.ca; Hendrick W. de Haan, U. of Ontario Inst. of Tech., hendrick.dehaan@uoit.ca
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described.
Open Datasets | Yes | Finally, we also apply our method to ReLU networks trained on the MNIST dataset [9], and show it is consistent with, and complementary to, another study of intrinsic dimensionality. ... [9] Yann LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998.
Dataset Splits | No | The paper mentions training data and test accuracies, but does not provide the train/validation/test splits (e.g., percentages, sample counts, or citations to predefined splits) needed for reproduction. It mentions 'Test accuracies' but not how the data was split.
Hardware Specification | No | The paper mentions 'TensorFlow' and '2 hours per network using our methodology and hardware' but does not provide specific details about the hardware (e.g., CPU or GPU models, or memory) used for the experiments.
Software Dependencies | No | The paper mentions the use of 'TensorFlow [1]' but does not provide version numbers for TensorFlow or any other software dependency.
Experiment Setup | Yes | The networks used in this work were all fully-connected with 4 hidden layers of equal width, implemented in TensorFlow [1]. Activation functions were tanh, except in Section 3.4, where ReLU was used. ... We used widths of 50, 100, 200, and 400; λ values of 0, 0.01, and 0.05; and four random seeds per combination.
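To make the quoted experiment setup concrete, the following is a minimal sketch in plain Python of the hyperparameter grid and the network shape it describes. It is not the authors' code, which was written in TensorFlow and is not publicly available. The function names, the seed values 0 to 3, the input and output sizes (2 and 1), and the weight initialization are all assumptions made for illustration; the role of λ is not stated in the quoted text, so it is carried through only as a label.

```python
import itertools
import math
import random

# Values quoted in the Experiment Setup row.
WIDTHS = (50, 100, 200, 400)
LAMBDAS = (0.0, 0.01, 0.05)
NUM_HIDDEN_LAYERS = 4
NUM_SEEDS = 4  # the source says "four random seeds"; the values 0..3 are assumed


def experiment_grid():
    """Enumerate every (width, lambda, seed) combination in the quoted grid."""
    return [
        {"width": w, "lam": lam, "seed": s}
        for w, lam, s in itertools.product(WIDTHS, LAMBDAS, range(NUM_SEEDS))
    ]


def layer_sizes(width, n_in=2, n_out=1):
    """Layer sizes for 4 equal-width hidden layers (n_in and n_out are assumed)."""
    return [n_in] + [width] * NUM_HIDDEN_LAYERS + [n_out]


def init_params(sizes, seed):
    """Random weights and zero biases; this init scheme is an assumption."""
    rng = random.Random(seed)
    params = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [
            [rng.uniform(-scale, scale) for _ in range(fan_in)]
            for _ in range(fan_out)
        ]
        params.append((weights, [0.0] * fan_out))
    return params


def forward(params, x):
    """Fully-connected forward pass: tanh on hidden layers, linear output."""
    h = list(x)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        z = [
            sum(w * v for w, v in zip(row, h)) + b
            for row, b in zip(weights, biases)
        ]
        h = z if i == last else [math.tanh(v) for v in z]
    return h
```

Under these assumptions the grid contains 4 × 3 × 4 = 48 configurations, and `forward(init_params(layer_sizes(50), seed=0), [0.1, 0.2])` returns a one-element list. Swapping `math.tanh` for a ReLU would correspond to the Section 3.4 variant mentioned in the same row.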