Jumpout: Improved Dropout for Deep Neural Networks with ReLUs

Authors: Shengjie Wang, Tianyi Zhou, Jeff Bilmes

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Jumpout significantly improves the performance of different neural nets on CIFAR10, CIFAR100, Fashion-MNIST, STL10, SVHN, ImageNet-1k, etc. We summarize the experimental results in Table 2, which shows that jumpout consistently (i.e., always) outperforms dropout on all datasets and all the DNNs we tested.
Researcher Affiliation | Academia | 1) Paul G. Allen School of Computer Science & Engineering, 2) Department of Electrical & Computer Engineering, University of Washington, Seattle, USA.
Pseudocode | Yes | Algorithm 1: Jumpout Layer with ReLU (a hedged sketch of such a layer is given after this table).
Open Source Code | No | The paper mentions external GitHub repositories (e.g., in footnotes 3 and 4) for models or datasets used, but it does not provide an explicit link to the source code for the authors' proposed method, jumpout, nor does it state that the code is publicly released.
Open Datasets | Yes | We apply dropout and jumpout to different popular DNN architectures and compare their performance on six benchmark datasets at different scales. In particular, these DNN architectures include a small CNN... applied to CIFAR10 (Krizhevsky & Hinton, 2009), Wide ResNet-28-10... applied to CIFAR10 and CIFAR100 (Krizhevsky & Hinton, 2009), pre-activation version of ResNet-20 (He et al., 2016b) applied to Fashion-MNIST (Fashion in all tables) (Xiao et al., 2017), Wide ResNet-16-8 applied to SVHN (Netzer et al., 2011) and STL10 (Coates et al., 2011), and ResNet-18 (He et al., 2016a) applied to ImageNet (Deng et al., 2009; Russakovsky et al., 2015).
Dataset Splits | Yes | On each dataset, we tune the dropout rate and σ in jumpout between [0.05, 0.60] with a step size of 0.05 on a validation set that is 20% of the original training set.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory specifications) used to run the experiments.
Software Dependencies | No | The paper implicitly suggests the use of PyTorch and MXNet through external GitHub links, but it does not provide specific version numbers for these or any other software dependencies, making the setup non-reproducible from a software perspective.
Experiment Setup | Yes | On each dataset, we tune the dropout rate and σ in jumpout between [0.05, 0.60] with a step size of 0.05 on a validation set that is 20% of the original training set. In practice, we set pmin = 0.01 and pmax = 0.6, which work consistently well over all datasets and models we tried. (A sketch of this tuning protocol follows the table.)
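
To make the "Pseudocode" row above concrete, below is a minimal PyTorch-style sketch of a jumpout-like layer. It assumes that Algorithm 1 draws a fresh dropout rate from the positive half of a Gaussian with standard deviation σ, clamps it to [pmin, pmax] (the paper reports pmin = 0.01 and pmax = 0.6), and applies an inverted-dropout mask to post-ReLU activations; the class name, the defaults, and any details of Algorithm 1 beyond what the report quotes are assumptions, not the authors' released code.

    import torch
    import torch.nn as nn

    class JumpoutSketch(nn.Module):
        """Illustrative jumpout-style layer; name and defaults are assumptions."""

        def __init__(self, sigma=0.2, p_min=0.01, p_max=0.6):
            super().__init__()
            self.sigma = sigma    # std of the half-Gaussian the rate is drawn from (tuned in [0.05, 0.60])
            self.p_min = p_min    # lower clamp on the sampled rate
            self.p_max = p_max    # upper clamp on the sampled rate

        def forward(self, x):
            if not self.training:
                return x          # like dropout, the layer is an identity map at test time
            # Sample a fresh dropout rate for this pass from a half-Gaussian with std sigma,
            # then clamp it to [p_min, p_max].
            p = abs(torch.randn(1).item()) * self.sigma
            p = min(max(p, self.p_min), self.p_max)
            # Inverted-dropout mask: zero units with probability p and rescale survivors by
            # 1 / (1 - p) so the expected activation scale is unchanged.
            mask = (torch.rand_like(x) > p).float() / (1.0 - p)
            return x * mask

Usage would mirror nn.Dropout, e.g. nn.Sequential(nn.Linear(128, 128), nn.ReLU(), JumpoutSketch(sigma=0.2)).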
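
The "Dataset Splits" and "Experiment Setup" rows describe the only tuning protocol the paper reports: hold out 20% of the training set as a validation set and sweep the rate/σ over [0.05, 0.60] in steps of 0.05. A minimal sketch of that protocol on CIFAR10 is below; the train_and_evaluate stub and the torchvision data pipeline are assumed placeholders, not the authors' pipeline.

    from torch.utils.data import random_split
    from torchvision import datasets, transforms

    def train_and_evaluate(train_set, val_set, sigma):
        # Placeholder: train a network that uses JumpoutSketch(sigma=sigma) after its
        # ReLUs and return validation accuracy. The actual training loop is omitted.
        return 0.0

    # CIFAR10 training set, with 20% held out for validation (as the report quotes).
    full_train = datasets.CIFAR10(root="./data", train=True, download=True,
                                  transform=transforms.ToTensor())
    val_size = int(0.2 * len(full_train))
    train_set, val_set = random_split(full_train, [len(full_train) - val_size, val_size])

    # Grid over [0.05, 0.60] with a step size of 0.05, as described above.
    grid = [round(0.05 * k, 2) for k in range(1, 13)]
    best_sigma, best_acc = None, float("-inf")
    for sigma in grid:
        acc = train_and_evaluate(train_set, val_set, sigma)
        if acc > best_acc:
            best_sigma, best_acc = sigma, acc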