Improved Expressivity Through Dendritic Neural Networks

Authors: Xundong Wu, Xiangwen Liu, Wei Li, Qing Wu

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We test DENN models on typical supervised machine learning tasks. It is revealed that our DENN structure can give neural network models a major boost in expressivity under the fixed parameter size and network depth. At the same time, it is empirically shown that the DENN structure can improve generalization performance on certain machine learning tasks. When tested on 121 UCI machine learning repository datasets, DENN models outrank naive standard FNN models.
Researcher Affiliation | Academia | Xundong Wu, Xiangwen Liu, Wei Li, Qing Wu; School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
Pseudocode | No | The paper describes the DENN model and its components in detail, including mathematical formulations, but it does not provide any pseudocode or algorithm blocks.
Open Source Code | Yes | Source code: https://github.com/motifMachine/Dendritic-neural-network
Open Datasets | Yes | We first test our models on permutation-invariant image datasets: the Fashion-MNIST [46], CIFAR-10 and CIFAR-100 datasets [18]... In addition to evaluating the fitting power of DENNs on image datasets, we also evaluate the generalization performance of our DENNs on a collection of 121 machine learning classification tasks from the UCI repository as used in [16, 6, 42].
Dataset Splits | Yes | The Fashion-MNIST dataset consists of 60,000 training and 10,000 test examples... The CIFAR-10/100 dataset consists of 50,000 training and 10,000 test images... To obtain the benchmark result of a specific standard FNN method, a grid search is performed to search for the best architecture and hyperparameters with a separate validation set. (A code sketch of this data handling follows the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU or GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using the Adam optimizer, ReLU, Softmax, batch normalization, and layer normalization, but it does not specify any software packages or libraries with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | We train all models in comparison with the Adam optimizer [15] for 100 epochs. The learning rate used is decayed exponentially from 0.01 to 1e-5 unless otherwise stated... For the DENN layer, we optimize the model architecture over the number of dendritic branches d for each hidden unit. The value of d is set to be one of 2^1, 2^2, ..., 2^8 for each model. Correspondingly, the number of active weights in each dendrite branch is set to 512/d. In addition to the regular DENN models, we also train models with a dropout rate of 0.2.
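
The Dataset Splits and Experiment Setup rows quote the data handling and optimization protocol but not the software stack, so the following is a minimal end-to-end sketch under stated assumptions: PyTorch/torchvision are assumed (the paper names no framework), the 5,000-example validation hold-out for the quoted grid search is an assumed size, and ordinary fully connected layers stand in for the DENN layer, whose formulation is given in the paper itself.

```python
# Sketch only. Assumptions not taken from the paper: PyTorch/torchvision as the
# framework, a 5,000-example validation hold-out, and plain linear layers in
# place of the DENN layer described in the paper.
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Flatten images for the permutation-invariant setting quoted above.
to_flat = transforms.Compose([transforms.ToTensor(),
                              transforms.Lambda(lambda x: x.view(-1))])

# Official splits quoted in the paper: Fashion-MNIST has 60,000 training and
# 10,000 test examples; CIFAR-10/100 have 50,000 and 10,000 respectively.
train_full = datasets.FashionMNIST("data", train=True, download=True, transform=to_flat)
test_set = datasets.FashionMNIST("data", train=False, download=True, transform=to_flat)

# Separate validation set for the grid search (5,000 is an assumed size).
train_set, val_set = random_split(train_full, [len(train_full) - 5_000, 5_000],
                                  generator=torch.Generator().manual_seed(0))
train_loader = DataLoader(train_set, batch_size=128, shuffle=True)

# Branch sizing quoted above: d in {2^1, ..., 2^8} dendritic branches per
# hidden unit, each holding 512 / d active weights, so every unit keeps a
# fixed 512-weight budget regardless of d.
for d in (2 ** k for k in range(1, 9)):
    assert d * (512 // d) == 512

# Stand-in network (not the DENN layer itself).
model = nn.Sequential(nn.Linear(784, 512), nn.ReLU(), nn.Linear(512, 10))

# Adam for 100 epochs, learning rate decayed exponentially from 0.01 to 1e-5.
epochs, lr_start, lr_end = 100, 1e-2, 1e-5
optimizer = torch.optim.Adam(model.parameters(), lr=lr_start)
scheduler = torch.optim.lr_scheduler.ExponentialLR(
    optimizer, gamma=(lr_end / lr_start) ** (1.0 / epochs))

for epoch in range(epochs):
    for x, y in train_loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # one exponential-decay step per epoch
```

With these numbers the per-epoch decay factor works out to (1e-5 / 0.01)^(1/100) ≈ 0.933, i.e. the learning rate is multiplied by roughly 0.933 after every epoch so that it reaches 1e-5 at epoch 100.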