Neural Networks Trained to Solve Differential Equations Learn General Representations
Authors: Martin Magill, Faisal Qureshi, Hendrick de Haan
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We illustrate this method by studying generality in neural networks trained to solve parametrized boundary value problems based on the Poisson partial differential equation. We find that the first hidden layers are general, and that they learn generalized coordinates over the input domain. Deeper layers are successively more specific. Next, we validate our method against an existing technique that measures layer generality using transfer learning experiments. We find excellent agreement between the two methods, and note that our method is much faster, particularly for continuously-parametrized problems. |
| Researcher Affiliation | Academia | Martin Magill U. of Ontario Inst. of Tech. martin.magill1@uoit.net Faisal Z. Qureshi U. of Ontario Inst. of Tech. faisal.qureshi@uoit.ca Hendrick W. de Haan U. of Ontario Inst. of Tech. hendrick.dehaan@uoit.ca |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a link to, or any other concrete means of accessing, source code for the described methodology. |
| Open Datasets | Yes | Finally, we also apply our method to ReLU networks trained on the MNIST dataset [9], and show it is consistent with, and complementary to, another study of intrinsic dimensionality. ... [9] Yann LeCun. The MNIST database of handwritten digits. http://yann.lecun.com/exdb/mnist/, 1998. |
| Dataset Splits | No | The paper mentions training data and test accuracies, but does not provide specific train/validation/test dataset splits (e.g., percentages, sample counts, or citations to predefined splits) needed for reproduction. It mentions 'Test accuracies' but not how the data was split. |
| Hardware Specification | No | The paper mentions 'TensorFlow' and '2 hours per network using our methodology and hardware' but does not provide specific details about the hardware (e.g., CPU, GPU models, or memory) used for the experiments. |
| Software Dependencies | No | The paper mentions the use of 'TensorFlow [1]' but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | The networks used in this work were all fully-connected with 4 hidden layers of equal width, implemented in TensorFlow [1]. Activation functions were tanh, except in Section 3.4, where ReLU was used. ... We used widths of 50, 100, 200, and 400; λ values of 0, 0.01, and 0.05; and four random seeds per combination. |
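
Based on the Experiment Setup row above, the sketch below shows how such networks and the reported hyperparameter sweep might be constructed in TensorFlow. The function name `build_network`, the use of the Keras API, and the training stub are illustrative assumptions rather than the authors' code; the role of λ (problem parameter versus regularization strength) is not specified in this excerpt.

```python
import itertools
import tensorflow as tf

def build_network(width, depth=4, activation="tanh"):
    """Fully-connected network with `depth` hidden layers of equal `width`
    and a scalar output, as described in the Experiment Setup row."""
    model = tf.keras.Sequential()
    for _ in range(depth):
        model.add(tf.keras.layers.Dense(width, activation=activation))
    model.add(tf.keras.layers.Dense(1))  # scalar solution estimate u(x, y)
    return model

# Sweep reported in the paper: widths 50/100/200/400, lambda in {0, 0.01, 0.05},
# and four random seeds per combination. The role of lambda is not specified
# in this excerpt, so it is carried through the loop without interpretation.
widths = [50, 100, 200, 400]
lambdas = [0.0, 0.01, 0.05]
seeds = range(4)

for width, lam, seed in itertools.product(widths, lambdas, seeds):
    tf.random.set_seed(seed)
    model = build_network(width, activation="tanh")  # ReLU only in Sec. 3.4
    # ... train `model` on the parametrized Poisson boundary value problem
```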
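The Research Type row notes that the networks were trained to solve parametrized boundary value problems based on the Poisson equation. The paper's exact loss is not reproduced in this summary; the sketch below assumes a standard residual-minimization formulation in which the network's Laplacian is penalized against the source term and Dirichlet boundary values are enforced with a penalty. The helper names `poisson_residual` and `bvp_loss`, the equal weighting of the two terms, and the 2D input are illustrative assumptions.

```python
import tensorflow as tf

def poisson_residual(model, xy, source_fn):
    """Mean squared residual of the 2D Poisson equation u_xx + u_yy = f(x, y)
    at collocation points xy (shape [N, 2]) for a network u = model(xy).
    source_fn(xy) is assumed to return f at those points with shape [N, 1]."""
    with tf.GradientTape(persistent=True) as outer:
        outer.watch(xy)
        with tf.GradientTape() as inner:
            inner.watch(xy)
            u = model(xy)                  # [N, 1]
        grad_u = inner.gradient(u, xy)     # [N, 2] = (u_x, u_y)
        u_x = grad_u[:, 0:1]
        u_y = grad_u[:, 1:2]
    u_xx = outer.gradient(u_x, xy)[:, 0:1]  # second derivatives via nested tapes
    u_yy = outer.gradient(u_y, xy)[:, 1:2]
    del outer
    laplacian = u_xx + u_yy
    return tf.reduce_mean(tf.square(laplacian - source_fn(xy)))

def bvp_loss(model, interior_pts, boundary_pts, boundary_vals, source_fn):
    """PDE residual on interior collocation points plus a Dirichlet boundary
    penalty; the equal weighting of the two terms is an assumption."""
    pde_term = poisson_residual(model, interior_pts, source_fn)
    bc_term = tf.reduce_mean(tf.square(model(boundary_pts) - boundary_vals))
    return pde_term + bc_term
```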