Rational neural networks
Authors: Nicolas Boullé, Yuji Nakatsukasa, Alex Townsend
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The flexibility and smoothness of rational activation functions make them an attractive alternative to ReLU, as we demonstrate with numerical experiments. and The experiments conducted in Section 4 demonstrate the potential applications of these rational networks for solving PDEs and Generative Adversarial Networks (GANs). (A code sketch of such a rational activation appears below the table.) |
| Researcher Affiliation | Academia | Nicolas Boullé, Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UK (boulle@maths.ox.ac.uk); Yuji Nakatsukasa, Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UK (nakatsukasa@maths.ox.ac.uk); Alex Townsend, Department of Mathematics, Cornell University, Ithaca, NY 14853, USA (townsend@cornell.edu) |
| Pseudocode | No | No pseudocode or algorithm blocks are present in the paper. |
| Open Source Code | Yes | All code and hyper-parameters are publicly available at [6]. and Nicolas Boullé, Yuji Nakatsukasa, and Alex Townsend. GitHub repository. https://github.com/NBoulle/RationalNets/, 2020. |
| Open Datasets | Yes | This section highlights the simplicity of using rational activation functions in existing neural network architectures by training an Auxiliary Classifier GAN (ACGAN) [41] on the MNIST dataset. and They evaluate their model on the MNIST and ImageNet image datasets [12, 30]. |
| Dataset Splits | No | The paper mentions using a 'validation set' but does not provide specific details on its size, percentage split, or the methodology for creating it. |
| Hardware Specification | No | The paper does not provide specific details on the hardware (e.g., GPU models, CPU types, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow' and 'Keras' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Raissi et al. use an identification network, consisting of 4 layers and 50 nodes per layer, to interpolate samples from a solution to the KdV equation. and The mean squared error (MSE) of the neural networks on the validation set throughout the training phase is reported in the right panel of Figure 2. ... after 10^4 epochs and We initialize the activation functions in the training phase with the best rational function that approximates the ReLU function on [-1, 1]. (A training sketch appears below the table.) |
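
The Research Type and Experiment Setup rows revolve around trainable rational activation functions. As a point of reference, here is a minimal sketch of such a layer of type (3, 2) in TensorFlow/Keras, the frameworks named in the Software Dependencies row. It is not the authors' implementation (that lives in the RationalNets repository cited above): the initialization below comes from a linearized least-squares fit of P/Q to ReLU on [-1, 1], a simple stand-in for the best (minimax) rational approximation the paper actually uses, and the helper names are invented for this example.

```python
# Sketch of a trainable type-(3, 2) rational activation in TensorFlow/Keras.
# NOT the authors' code: coefficients are initialized by a linearized
# least-squares fit to ReLU on [-1, 1] (a stand-in for the paper's minimax
# initialization), and all names here are illustrative.
import numpy as np
import tensorflow as tf


def fit_rational_to_relu(num_degree=3, den_degree=2, n_samples=1001):
    """Fit P(x)/Q(x) to ReLU on [-1, 1] by linearized least squares.

    Solves P(x) - relu(x) * (q1*x + q2*x**2) ~= relu(x), fixing q0 = 1.
    """
    x = np.linspace(-1.0, 1.0, n_samples)
    y = np.maximum(x, 0.0)
    A = np.concatenate(
        [x[:, None] ** np.arange(num_degree + 1),                    # p0..p3
         -y[:, None] * x[:, None] ** np.arange(1, den_degree + 1)],  # q1, q2
        axis=1)
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    p = coeffs[:num_degree + 1]
    q = np.concatenate([[1.0], coeffs[num_degree + 1:]])
    # In practice, check that Q(x) = q0 + q1*x + q2*x**2 has no zeros on the
    # domain before using these coefficients as an initialization.
    return p, q


class RationalLayer(tf.keras.layers.Layer):
    """Elementwise trainable rational activation x -> P(x)/Q(x), type (3, 2)."""

    def __init__(self, p_init, q_init, **kwargs):
        super().__init__(**kwargs)
        self.p_init = np.asarray(p_init, dtype="float32")
        self.q_init = np.asarray(q_init, dtype="float32")

    def build(self, input_shape):
        # Numerator and denominator coefficients are trained like any weight.
        self.p = self.add_weight(
            name="p", shape=(4,), trainable=True,
            initializer=tf.constant_initializer(self.p_init))
        self.q = self.add_weight(
            name="q", shape=(3,), trainable=True,
            initializer=tf.constant_initializer(self.q_init))

    def call(self, x):
        num = self.p[0] + self.p[1] * x + self.p[2] * x**2 + self.p[3] * x**3
        den = self.q[0] + self.q[1] * x + self.q[2] * x**2
        return num / den
```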
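
Building on that layer, the network described in the Experiment Setup row (4 layers, 50 nodes per layer, MSE tracked on a validation set, rational activations initialized near ReLU on [-1, 1]) could be assembled as follows. The Adam optimizer and the (x, t)-to-u input/output shapes are assumptions for illustration, not details stated in the quoted text.

```python
# Hypothetical assembly of the 4-layer, 50-node identification network from
# the Experiment Setup row. Optimizer and input/output shapes are assumptions.
p0, q0 = fit_rational_to_relu()            # start close to ReLU on [-1, 1]

inputs = tf.keras.Input(shape=(2,))        # assumed: (x, t) samples of the PDE solution
h = inputs
for _ in range(4):                         # 4 hidden layers, 50 nodes each
    h = tf.keras.layers.Dense(50)(h)
    h = RationalLayer(p0, q0)(h)           # trainable rational activation
outputs = tf.keras.layers.Dense(1)(h)      # assumed: scalar solution value u(x, t)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
# model.fit(..., validation_data=..., epochs=10_000) would then expose the
# validation-set MSE curve that the row refers to.
```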