Self-Normalizing Neural Networks
Authors: Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compared SNNs on (a) 121 tasks from the UCI machine learning repository, on (b) drug discovery benchmarks, and on (c) astronomy tasks with standard FNNs, and other machine learning methods such as random forests and support vector machines. |
| Researcher Affiliation | Academia | Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter; LIT AI Lab & Institute of Bioinformatics, Johannes Kepler University Linz, A-4040 Linz, Austria; {klambauer,unterthiner,mayr,hochreit}@bioinf.jku.at |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of its source code. |
| Open Datasets | Yes | We compared SNNs on (a) 121 tasks from the UCI machine learning repository, on (b) drug discovery benchmarks, and on (c) astronomy tasks... The Tox21 challenge dataset comprises about 12,000 chemical compounds... Recently, the High Time Resolution Universe Survey (HTRU2) dataset has been released with 1,639 real pulsars and 16,259 spurious signals. |
| Dataset Splits | Yes | Hyperparameters such as number of layers (blocks), neurons per layer, learning rate, and dropout rate, are adjusted by grid-search for each dataset on a separate validation set (see Supplementary Section S4). We used the validation sets of the challenge winners for hyperparameter selection (see Supplementary Section S4)... We assessed the performance of FNNs using 10-fold nested cross-validation, where the hyperparameters were selected in the inner loop on a validation set. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility. |
| Experiment Setup | Yes | Hyperparameters such as number of layers (blocks), neurons per layer, learning rate, and dropout rate, are adjusted by grid-search for each dataset on a separate validation set (see Supplementary Section S4). SELU activation with parameters λ ≈ 1.0507 and α ≈ 1.6733, inputs normalized to zero mean and unit variance, network weights initialized with variance 1/n, and regularization with alpha-dropout. Empirically, we found that dropout rates 1 − q = 0.05 or 0.10 lead to models with good performance. |
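
The Experiment Setup row quotes the core SNN recipe: SELU activation with the reported λ and α, inputs standardized to zero mean and unit variance, and weights initialized with variance 1/n. The NumPy sketch below illustrates that recipe only; the full-precision constants and the `lecun_normal_init` helper are illustrative assumptions, not taken from the paper or the run details above.

```python
import numpy as np

# SELU constants as quoted in the setup row (lambda ~ 1.0507, alpha ~ 1.6733);
# the extra decimals are the commonly used full-precision values (assumption).
LAMBDA = 1.0507009873554805
ALPHA = 1.6732632423543772

def selu(x):
    # scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise
    return LAMBDA * np.where(x > 0, x, ALPHA * np.expm1(x))

def lecun_normal_init(fan_in, fan_out, rng):
    # Weights drawn with variance 1/n, where n is the number of incoming units,
    # matching the "initialized with variance 1/n" statement.
    return rng.normal(0.0, np.sqrt(1.0 / fan_in), size=(fan_in, fan_out))

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(4096, 128))  # inputs standardized to mean 0, variance 1
h = x
for _ in range(8):                          # a few stacked SELU layers
    h = selu(h @ lecun_normal_init(h.shape[1], 128, rng))

# Activations should stay close to zero mean and unit variance across layers.
print(round(h.mean(), 3), round(h.std(), 3))
```

Alpha-dropout, the remaining ingredient in the quoted setup, is not sketched here; off-the-shelf implementations such as PyTorch's `torch.nn.AlphaDropout` can be used with the small rates (0.05 or 0.10) the paper reports as working well.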