Self-Normalizing Neural Networks

Authors: Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter

NeurIPS 2017

Reproducibility assessment: each variable below is listed with its assessed result and the supporting LLM response.

Research Type: Experimental
LLM response: "We compared SNNs on (a) 121 tasks from the UCI machine learning repository, on (b) drug discovery benchmarks, and on (c) astronomy tasks with standard FNNs, and other machine learning methods such as random forests and support vector machines."

Researcher Affiliation: Academia
LLM response: "Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter; LIT AI Lab & Institute of Bioinformatics, Johannes Kepler University Linz, A-4040 Linz, Austria; {klambauer,unterthiner,mayr,hochreit}@bioinf.jku.at"

Pseudocode: No
LLM response: The paper does not contain structured pseudocode or algorithm blocks.

Open Source Code: No
LLM response: The paper does not provide any statement or link regarding the availability of its source code.

Open Datasets: Yes
LLM response: "We compared SNNs on (a) 121 tasks from the UCI machine learning repository, on (b) drug discovery benchmarks, and on (c) astronomy tasks... The Tox21 challenge dataset comprises about 12,000 chemical compounds... Recently, the High Time Resolution Universe Survey (HTRU2) dataset has been released with 1,639 real pulsars and 16,259 spurious signals."

Dataset Splits: Yes
LLM response: "Hyperparameters such as number of layers (blocks), neurons per layer, learning rate, and dropout rate, are adjusted by grid-search for each dataset on a separate validation set (see Supplementary Section S4). We used the validation sets of the challenge winners for hyperparameter selection (see Supplementary Section S4)... We assessed the performance of FNNs using 10-fold nested cross-validation, where the hyperparameters were selected in the inner loop on a validation set." A minimal sketch of this nested cross-validation protocol appears after the table.

Hardware Specification: No
LLM response: The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.

Software Dependencies: No
LLM response: The paper does not provide specific software names with version numbers for reproducibility.

Experiment Setup: Yes
LLM response: "Hyperparameters such as number of layers (blocks), neurons per layer, learning rate, and dropout rate, are adjusted by grid-search for each dataset on a separate validation set (see Supplementary Section S4)." The setup uses the SELU activation with parameters λ ≈ 1.0507 and α ≈ 1.6733, inputs normalized to zero mean and unit variance, network weights initialized with variance 1/n, and regularization with alpha-dropout. "Empirically, we found that dropout rates 1 − q = 0.05 or 0.10 lead to models with good performance." A sketch of this configuration also follows the table.
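The splitting protocol quoted in the Dataset Splits row (grid search on an inner validation set, evaluation on an outer 10-fold loop) is standard nested cross-validation. Below is a minimal sketch using scikit-learn; the MLPClassifier stand-in, the synthetic data, and the parameter grid values are illustrative assumptions, not the paper's actual model or search space.

```python
# Nested cross-validation sketch: hyperparameters are chosen by grid search
# on inner validation folds, and the selected model is scored on an outer
# 10-fold loop, mirroring the protocol quoted above. The estimator and grid
# are illustrative stand-ins, not the paper's SNN implementation.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

param_grid = {
    "hidden_layer_sizes": [(64,), (64, 64)],  # number of layers / neurons per layer
    "learning_rate_init": [1e-2, 1e-3],       # learning rate
}

inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)   # inner validation folds
outer_cv = KFold(n_splits=10, shuffle=True, random_state=0)  # 10-fold outer loop

search = GridSearchCV(MLPClassifier(max_iter=500), param_grid, cv=inner_cv)
scores = cross_val_score(search, X, y, cv=outer_cv)
print(f"nested CV accuracy: {scores.mean():.3f} ± {scores.std():.3f}")
```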
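The Experiment Setup row describes the SELU activation, selu(x) = λx for x > 0 and λα(eˣ − 1) otherwise, LeCun-style initialization with weight variance 1/n, and alpha-dropout. The following is a minimal sketch of one such self-normalizing block; PyTorch is an assumption (the paper releases no code), though torch.nn.SELU and torch.nn.AlphaDropout do use the quoted constants and the paper's dropout variant.

```python
# Self-normalizing FNN block sketch in PyTorch (an assumed framework; the
# paper itself provides no code). nn.SELU uses the quoted constants
# λ ≈ 1.0507 and α ≈ 1.6733, and nn.AlphaDropout implements alpha-dropout.
# Weights are drawn with variance 1/n (LeCun normal), and inputs are assumed
# standardized to zero mean and unit variance, matching the setup above.
import math
import torch
from torch import nn

def snn_block(n_in: int, n_out: int, dropout_rate: float = 0.05) -> nn.Sequential:
    linear = nn.Linear(n_in, n_out)
    # LeCun-normal initialization: Var(w) = 1 / fan_in.
    nn.init.normal_(linear.weight, mean=0.0, std=math.sqrt(1.0 / n_in))
    nn.init.zeros_(linear.bias)
    return nn.Sequential(linear, nn.SELU(), nn.AlphaDropout(p=dropout_rate))

model = nn.Sequential(
    snn_block(20, 64),
    snn_block(64, 64),
    nn.Linear(64, 2),  # plain linear output layer
)

x = torch.randn(8, 20)  # standardized inputs: zero mean, unit variance
print(model(x).shape)   # torch.Size([8, 2])
```

The dropout rate of 0.05 reflects the paper's empirical finding that rates of 0.05 or 0.10 work well with alpha-dropout.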