Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization
Authors: Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed training algorithm in various architectures for real world tasks including object recognition [40], visual question answering [39], image captioning [35] and action recognition [8]. These models are chosen for our experiments since they use dropouts actively for regularization. To isolate the effect of the proposed training method, we employ simple models without integrating heuristics for performance improvement (e.g., model ensembles, multi-scaling, etc.) and make hyper-parameters (e.g., type of optimizer, learning rate, batch size, etc.) fixed. |
| Researcher Affiliation | Academia | Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han; Dept. of Computer Science and Engineering, POSTECH, Korea; {shgusdngogo,tackgeun.you,choco1916,bhhan}@postech.ac.kr |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using a publicly available implementation of Wide ResNet for experiments ("https://github.com/szagoruyko/wide-residual-networks") and other third-party code, but does not release its own code or explicitly state that an implementation of the proposed method is open source. |
| Open Datasets | Yes | We evaluate the proposed training algorithm in various architectures for real world tasks including object recognition [40], visual question answering [39], image captioning [35] and action recognition [8]... evaluated on CIFAR datasets [19]... We use VQA dataset [2], which is commonly used for the evaluation of VQA algorithms... We use MSCOCO dataset for experiment... We employ a well-known benchmark of action classification, UCF-101 [33], for evaluation... |
| Dataset Splits | Yes | The dataset has three splits for cross validation, and the final performance is calculated by the average accuracy of the three splits. (UCF-101) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, memory amounts) used for the experiments; it only states generally that hyper-parameters were kept fixed. |
| Software Dependencies | No | The paper mentions models and architectures like 'two-layer LSTM' or 'VGG-16' but does not list specific software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, specific CUDA versions). |
| Experiment Setup | Yes | We perform experiments using the wide residual network with widening factor 10 and depth 28... Wide ResNet (depth=28, dropout=0.3) [40]... Wide ResNet (depth=28, dropout=0.5)... with IWSGD (S = 4)... with IWSGD (S = 8)... When we evaluate performance of IWSGD with 5 and 8 samples... (a hedged sketch of the IWSGD update follows this table) |
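
The Experiment Setup row references the paper's proposed optimizer, importance weighted stochastic gradient descent (IWSGD), which treats dropout as noise sampling and maximizes an importance-weighted lower bound on the marginal likelihood: for each input, S independent dropout masks are drawn, and the per-sample gradients of $\log p(y|x,\epsilon_s)$ are combined with normalized likelihood weights $w_s = p(y|x,\epsilon_s) / \sum_{s'} p(y|x,\epsilon_{s'})$. Below is a minimal PyTorch-style sketch of one such update, assuming a classification model whose forward pass applies dropout in training mode; the function name `iwsgd_step` and the surrounding training details (optimizer, loss choice) are illustrative assumptions, not taken from the paper or any released code.

```python
import torch
import torch.nn.functional as F

def iwsgd_step(model, optimizer, x, y, S=4):
    """One hypothetical IWSGD update with S dropout samples per input.

    Runs S independent stochastic forward passes (a fresh dropout mask is
    sampled each time), then weights each sample's gradient by its
    normalized likelihood, following the importance-weighted bound
    described in the paper.
    """
    model.train()          # ensure dropout is active during the forward passes
    optimizer.zero_grad()

    # log p(y | x, eps_s) for each of the S dropout samples, shape (S, batch).
    log_p = torch.stack([
        -F.cross_entropy(model(x), y, reduction="none")
        for _ in range(S)
    ])

    # Importance weights w_s = p_s / sum_s' p_s', computed stably in log space
    # and detached: the gradient of the bound is sum_s w_s * grad log p_s.
    w = torch.softmax(log_p, dim=0).detach()

    # Negative importance-weighted log-likelihood, averaged over the batch.
    loss = -(w * log_p).sum(dim=0).mean()
    loss.backward()
    optimizer.step()
    return loss.item()
```

With S = 1 the single weight is 1 and the update reduces to ordinary SGD with dropout, which is consistent with the paper's interpretation of conventional noise-injected training as the single-sample special case of this objective.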