Rational neural networks

Authors: Nicolas Boullé, Yuji Nakatsukasa, Alex Townsend

NeurIPS 2020

Reproducibility variables, results, and supporting LLM responses:
Research Type: Experimental. Evidence: "The flexibility and smoothness of rational activation functions make them an attractive alternative to ReLU, as we demonstrate with numerical experiments." and "The experiments conducted in Section 4 demonstrate the potential applications of these rational networks for solving PDEs and Generative Adversarial Networks (GANs)."
Researcher Affiliation: Academia. Author affiliations: Nicolas Boullé, Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK (boulle@maths.ox.ac.uk); Yuji Nakatsukasa, Mathematical Institute, University of Oxford, Oxford OX2 6GG, UK (nakatsukasa@maths.ox.ac.uk); Alex Townsend, Department of Mathematics, Cornell University, Ithaca, NY 14853, USA (townsend@cornell.edu).
Pseudocode: No. No pseudocode or algorithm blocks are present in the paper.
Open Source Code: Yes. Evidence: "All code and hyper-parameters are publicly available at [6]." The cited repository is: Nicolas Boullé, Yuji Nakatsukasa, and Alex Townsend. GitHub repository. https://github.com/NBoulle/RationalNets/, 2020.
Open Datasets: Yes. Evidence: "This section highlights the simplicity of using rational activation functions in existing neural network architectures by training an Auxiliary Classifier GAN (ACGAN) [41] on the MNIST dataset." and "They evaluate their model on the MNIST and ImageNet image datasets [12, 30]."
Dataset Splits: No. The paper mentions using a "validation set" but does not provide specific details on its size, percentage split, or the methodology for creating it.
Hardware Specification: No. The paper does not provide specific details on the hardware (e.g., GPU models, CPU types, memory) used for running experiments.
Software Dependencies: No. The paper mentions TensorFlow and Keras but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup: Yes. Evidence: "Raissi et al. use an identification network, consisting of 4 layers and 50 nodes per layer, to interpolate samples from a solution to the KdV equation." and "The mean squared error (MSE) of the neural networks on the validation set throughout the training phase is reported in the right panel of Figure 2. ... after 10^4 epochs" and "We initialize the activation functions in the training phase with the best rational function that approximates the ReLU function on [-1, 1]."
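
For context on the setup quoted above: the paper's central object is a low-degree rational activation function with trainable coefficients (type (3, 2), i.e. a degree-3 numerator over a degree-2 denominator), initialized as the best rational approximant of ReLU on [-1, 1]. The sketch below is a minimal, hypothetical TensorFlow/Keras rendering of that idea, not the authors' implementation: the class RationalActivation, the helper make_rational_mlp, the identity-map placeholder initialization (the actual ReLU-approximant coefficients are given in the repository [6]), the absence of any pole safeguard, and the synthetic (x, t) training data are all illustrative assumptions.

```python
# Minimal sketch (assumption, not the authors' code) of a trainable rational
# activation of type (3, 2): y = P(x) / Q(x) with deg(P) = 3 and deg(Q) = 2.
import tensorflow as tf


class RationalActivation(tf.keras.layers.Layer):
    """Element-wise rational activation with trainable coefficients."""

    def build(self, input_shape):
        # Placeholder initialization giving the identity map P(x) = x, Q(x) = 1.
        # The paper instead initializes with the best rational approximant of
        # ReLU on [-1, 1]; those coefficients are provided in the repository [6].
        self.p = self.add_weight(
            name="numerator", shape=(4,),
            initializer=tf.constant_initializer([0.0, 1.0, 0.0, 0.0]),
            trainable=True)
        self.q = self.add_weight(
            name="denominator", shape=(3,),
            initializer=tf.constant_initializer([1.0, 0.0, 0.0]),
            trainable=True)

    def call(self, x):
        # Horner evaluation of P(x) = p0 + p1*x + p2*x^2 + p3*x^3
        # and Q(x) = q0 + q1*x + q2*x^2.
        num = self.p[0] + x * (self.p[1] + x * (self.p[2] + x * self.p[3]))
        den = self.q[0] + x * (self.q[1] + x * self.q[2])
        # No safeguard against poles is applied here; this is a simplification
        # relative to whatever stabilization the authors may use.
        return num / den


def make_rational_mlp(width=50, depth=4, output_dim=1):
    """Dense network with `depth` hidden layers of `width` nodes each,
    matching the 4-layer, 50-node identification network quoted above.
    Input dimension is inferred from the training data on first call."""
    model = tf.keras.Sequential()
    for _ in range(depth):
        model.add(tf.keras.layers.Dense(width))
        model.add(RationalActivation())
    model.add(tf.keras.layers.Dense(output_dim))
    return model


# Example: fit samples of a function of (x, t) with a mean squared error loss,
# in the spirit of the validation-MSE curves described above (synthetic data).
model = make_rational_mlp()
model.compile(optimizer="adam", loss="mse")
xt = tf.random.uniform((256, 2), -1.0, 1.0)        # synthetic (x, t) samples
u = tf.sin(3.0 * xt[:, :1]) * tf.exp(-xt[:, 1:])   # synthetic target values
model.fit(xt, u, epochs=5, verbose=0)              # short demonstration run
```

In this sketch each RationalActivation instance carries its own trainable coefficients, so every layer can learn a different activation; whether and how coefficients are shared, regularized, or protected against poles is left to the authors' published code.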