Neural Power Units
Authors: Niklas Heim, Tomas Pevny, Vasek Smidl
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that the NPUs outperform their competitors in terms of accuracy and sparsity on artificial arithmetic datasets, and that the Real NPU can discover the governing equations of a dynamical system only from data. |
| Researcher Affiliation | Academia | Niklas Heim, Tomáš Pevný, Václav Šmídl Artificial Intelligence Center Czech Technical University Prague, CZ 120 00 {niklas.heim, tomas.pevny, vasek.smidl}@aic.fel.cvut.cz |
| Pseudocode | No | The paper provides mathematical definitions and diagrams (Fig. 1, Fig. 2) of the NPU and Naive NPU, but does not include any pseudocode or algorithm blocks (a hedged sketch of the Real NPU forward pass is given below the table). |
| Open Source Code | Yes | Implementation of neural arithmetic units: github.com/nmheim/NeuralArithmetic.jl. The code to reproduce our experiments is available at github.com/nmheim/NeuralPowerUnits. |
| Open Datasets | No | No concrete access information (specific link, DOI, repository name, formal citation, or reference to established benchmark datasets) for a publicly available or open dataset was provided. The paper describes data generation processes for its experiments (e.g., 'We have numerically simulated one realization of the fSIR model', 'input samples are generated on the fly', 'dataset generation is defined in the set of Eq. 23'). |
| Dataset Splits | No | The paper does not explicitly provide training/validation/test dataset splits. For the fSIR model, it mentions 'training data X' but no validation split. For the simple and large-scale arithmetic tasks, it states inputs are 'generated on the fly during training' and focuses on training and testing without specifying a validation set (a sketch of such on-the-fly batch generation is given below the table). |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory amounts, or cloud instances) used to run the experiments. |
| Software Dependencies | No | The paper mentions the 'Julia packages Flux.jl [Innes et al., 2018] and DifferentialEquations.jl [Rackauckas and Nie, 2017]' but does not specify their version numbers or other software dependencies with versions. |
| Experiment Setup | Yes | We train each model for 3000 steps with the ADAM optimizer and a learning rate of 0.005, and subsequently with LBFGS until convergence (or for a maximum of 1000 steps). For each model type, we run a small grid search to build a Pareto front with h ∈ {6, 9, 12, 15, 20} and β ∈ {0, 0.01, 0.1, 1}. (A hedged sketch of this schedule follows the table.) |
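
The Pseudocode row notes that the paper defines the NPU and Naive NPU only through equations and diagrams. For orientation, below is a minimal sketch of a Real NPU forward pass, assuming the gated formulation ŷ = exp(W log r) ⊙ cos(Wkπ) with gated magnitude r and sign indicator k; the struct layout and names are illustrative and are not taken from the authors' NeuralArithmetic.jl code.

```julia
# Minimal Real NPU sketch (illustrative, not the authors' implementation):
#   r_i = g_i * |x_i| + (1 - g_i)      gated magnitude (irrelevant inputs -> 1)
#   k_i = g_i * [x_i < 0]              gated sign indicator
#   y   = exp(W * log.(r)) .* cos.(W * (k .* π))
struct RealNPU
    W::Matrix{Float64}   # m × n power weights
    g::Vector{Float64}   # n relevance gates in [0, 1]
end

function (npu::RealNPU)(x::AbstractVector)
    g = clamp.(npu.g, 0, 1)
    r = g .* abs.(x) .+ (1 .- g)
    k = g .* (x .< 0)
    return exp.(npu.W * log.(r)) .* cos.(npu.W * (k .* π))
end

# With W = [1 1] and full relevance the unit multiplies its inputs:
RealNPU([1.0 1.0], [1.0, 1.0])([-2.0, 3.0])   # ≈ [-6.0]
```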
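The Dataset Splits row reports that inputs for the arithmetic tasks are generated on the fly during training. A hypothetical sketch of such a batch generator is shown below; the sampling range and the four target functions (sum, product, quotient, square root) are illustrative assumptions, with the exact definitions given by Eq. 23 in the paper.

```julia
# Hypothetical on-the-fly batch generator for the arithmetic task.
# The U(0.1, 2) sampling range and the target functions are assumptions.
function arithmetic_batch(batchsize::Int = 100)
    x = rand(2, batchsize) .* 1.9 .+ 0.1
    y = vcat(x[1, :]' .+ x[2, :]',     # addition
             x[1, :]' .* x[2, :]',     # multiplication
             x[1, :]' ./ x[2, :]',     # division
             sqrt.(x[1, :]'))          # square root
    return x, y
end
```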
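The Experiment Setup row fully specifies the optimization schedule. The sketch below wires those numbers (3000 ADAM steps at learning rate 0.005, then LBFGS for at most 1000 iterations, plus the grid search over h and β) into a training loop using Flux.jl and Optim.jl. The architecture is a plain dense stand-in rather than the paper's NPU models, and the L1 form of the β penalty is an assumption.

```julia
using Flux, Optim

function train_model(h::Int, β::Float64)
    model = Chain(Dense(2, h), Dense(h, 4))   # stand-in for the paper's NPU architectures
    ps    = Flux.params(model)
    loss(x, y) = Flux.mse(model(x), y) + β * sum(p -> sum(abs, p), ps)  # L1 penalty form assumed

    opt = ADAM(0.005)
    for _ in 1:3000                            # phase 1: 3000 ADAM steps
        x, y = arithmetic_batch()              # batches generated on the fly
        gs = gradient(() -> loss(x, y), ps)
        Flux.Optimise.update!(opt, ps, gs)
    end

    θ0, rebuild = Flux.destructure(model)      # phase 2: LBFGS until convergence
    x, y = arithmetic_batch()                  #          (or at most 1000 steps)
    obj(θ) = Flux.mse(rebuild(θ)(x), y) + β * sum(abs, θ)
    res = Optim.optimize(obj, θ0, LBFGS(), Optim.Options(iterations = 1000))
    return rebuild(Optim.minimizer(res))
end

# Pareto-front grid search over hidden sizes and sparsity weights.
models = [train_model(h, β) for h in (6, 9, 12, 15, 20), β in (0.0, 0.01, 0.1, 1.0)]
```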