Deep Neural Nets with Interpolating Function as Output Activation

Authors: Bao Wang, Xiyang Luo, Zhen Li, Wei Zhu, Zuoqiang Shi, Stanley Osher

NeurIPS 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate the classification accuracy, efficiency and robustness of the proposed framework, we test the new architecture and algorithm on CIFAR10, CIFAR100 [18], MNIST [20] and SVHN datasets [24]. In all experiments, we apply standard data augmentation that is widely used for the CIFAR datasets [12, 16, 31]. (A hedged sketch of this standard augmentation appears after the table.)
Researcher Affiliation | Academia | Bao Wang, Department of Mathematics, University of California, Los Angeles, wangbaonj@gmail.com; Xiyang Luo, Department of Mathematics, University of California, Los Angeles, xylmath@gmail.com; Zhen Li, Department of Mathematics, HKUST, Hong Kong, lishen03@gmail.com; Wei Zhu, Department of Mathematics, Duke University, zhu@math.duke.edu; Zuoqiang Shi, Department of Mathematics, Tsinghua University, zqshi@mail.tsinghua.edu.cn; Stanley J. Osher, Department of Mathematics, University of California, Los Angeles, sjo@math.ucla.edu
Pseudocode | Yes | We summarize the training and testing procedures for the WNLL activated DNNs in Algorithms 1 and 2, respectively. Algorithm 1: DNNs with WNLL as Output Activation, Training Procedure. Algorithm 2: DNNs with WNLL as Output Activation, Testing Procedure.
Open Source Code | Yes | The algorithm is implemented in PyTorch, and the code is available at https://github.com/BaoWangMath/DNN-DataDependentActivation.
Open Datasets | Yes | To validate the classification accuracy, efficiency and robustness of the proposed framework, we test the new architecture and algorithm on CIFAR10, CIFAR100 [18], MNIST [20] and SVHN datasets [24].
Dataset Splits | No | The paper does not explicitly describe a separate validation split with specific percentages or counts. It refers only to 'training data' and 'testing data', and to 'randomly separate a template, e.g., half of the entire data, from the training set which will be used to perform WNLL interpolation in training WNLL activated DNNs'; this template is not a standard validation split. (A sketch of such a template split appears after the table.)
Hardware Specification | Yes | All computations are carried out on a machine with a single Nvidia Titan Xp graphics card.
Software Dependencies | No | The paper states: 'We implement our algorithm on the PyTorch platform [26].' PyTorch is named, but no version number is given for this dependency.
Experiment Setup | Yes | We take two passes alternating steps, i.e., N = 2 in Algorithm 1. For the linear activation stage (Stage 1), we train the network for n = 400 epochs. For the WNLL stage, we train for n = 5 epochs. In the first pass, the initial learning rate is 0.05 and halved after every 50 epochs in training linear activated DNNs, and 0.0005 when training the WNLL activation. The same Nesterov momentum and weight decay as used in [12, 17] are used for CIFAR and SVHN experiments, respectively. In the second pass, the learning rate is set to be one fifth of the corresponding epochs in the first pass. The batch sizes are 128 and 2000 when training softmax/linear and WNLL activated DNNs, respectively. (A sketch of this alternating schedule appears after the table.)
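
The "standard data augmentation that is widely used for the CIFAR datasets" quoted under Research Type is not spelled out in this excerpt. Below is a minimal PyTorch/torchvision sketch assuming the usual recipe from [12, 16, 31], i.e., a 4-pixel-padded random crop plus a random horizontal flip; the normalization statistics are commonly used CIFAR10 values, not numbers taken from the paper.

```python
# Hedged sketch: the "standard data augmentation" for CIFAR quoted above is
# assumed to be the usual 4-pixel-padded random crop + horizontal flip; the
# normalization statistics below are commonly used CIFAR10 values (an
# assumption, not numbers from the paper).
import torchvision.transforms as T
from torchvision.datasets import CIFAR10

train_transform = T.Compose([
    T.RandomCrop(32, padding=4),           # pad by 4 pixels, then crop back to 32x32
    T.RandomHorizontalFlip(),              # flip left-right with probability 0.5
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465),  # assumed per-channel mean
                (0.2470, 0.2435, 0.2616)), # assumed per-channel std
])

test_transform = T.Compose([
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465),
                (0.2470, 0.2435, 0.2616)),
])

train_set = CIFAR10(root="./data", train=True,  download=True, transform=train_transform)
test_set  = CIFAR10(root="./data", train=False, download=True, transform=test_transform)
```

Only the training split is augmented; the test split is normalized without random transforms.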
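The Dataset Splits row notes that roughly half of the training set is set aside as a "template" for the WNLL interpolation rather than as a validation set. Below is a minimal sketch of such a split using torch.utils.data.random_split; the 50/50 ratio follows the paper's "e.g., half of the entire data", while the variable names (and the train_set object from the previous sketch) are illustrative.

```python
import torch
from torch.utils.data import random_split

# Split the training set into a "template" set (labeled anchors for the WNLL
# interpolation) and the remaining training data. The 50/50 ratio follows the
# paper's "e.g., half of the entire data"; the seed and names are illustrative.
n_total    = len(train_set)
n_template = n_total // 2
generator  = torch.Generator().manual_seed(0)  # assumed seed for reproducibility
template_set, remaining_train_set = random_split(
    train_set, [n_template, n_total - n_template], generator=generator
)
```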
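The alternating training procedure of Algorithms 1 and 2, together with the hyperparameters quoted under Experiment Setup, can be sketched as a two-stage loop. Only the loop structure and the quoted numbers (N = 2 passes, 400 linear epochs, 5 WNLL epochs, learning rates 0.05 and 0.0005, halving every 50 epochs, one-fifth rates in the second pass, batch sizes 128 and 2000) come from the paper; the stage-training callables passed in are hypothetical placeholders, and the WNLL interpolation itself is not reproduced here.

```python
# Hedged sketch of the alternating schedule from Algorithms 1-2 and the
# Experiment Setup row. The two per-epoch training callables are hypothetical
# placeholders supplied by the caller; only the schedule is sketched here.

def halve_every_50(base_lr: float, epoch: int) -> float:
    """Learning-rate rule quoted in the paper: halve the rate every 50 epochs."""
    return base_lr * (0.5 ** (epoch // 50))

def run_alternating_schedule(train_linear_epoch, train_wnll_epoch,
                             n_passes: int = 2,         # "N = 2 in Algorithm 1"
                             linear_epochs: int = 400,  # Stage 1: linear/softmax activation
                             wnll_epochs: int = 5,      # Stage 2: WNLL activation
                             base_lr_linear: float = 0.05,
                             base_lr_wnll: float = 0.0005):
    for p in range(n_passes):
        # Second pass: learning rates are one fifth of the corresponding
        # first-pass rates.
        pass_scale = 1.0 if p == 0 else 0.2

        # Stage 1: train with the linear/softmax output activation, batch size 128.
        for epoch in range(linear_epochs):
            lr = pass_scale * halve_every_50(base_lr_linear, epoch)
            train_linear_epoch(lr=lr, batch_size=128)

        # Stage 2: train with the WNLL interpolating output activation, batch size 2000.
        for epoch in range(wnll_epochs):
            lr = pass_scale * base_lr_wnll
            train_wnll_epoch(lr=lr, batch_size=2000)
```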