Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Regularizing Deep Neural Networks by Noise: Its Interpretation and Optimization
Authors: Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed training algorithm in various architectures for real world tasks including object recognition [40], visual question answering [39], image captioning [35] and action recognition [8]. These models are chosen for our experiments since they use dropouts actively for regularization. To isolate the effect of the proposed training method, we employ simple models without integrating heuristics for performance improvement (e.g., model ensembles, multi-scaling, etc.) and make hyper-parameters (e.g., type of optimizer, learning rate, batch size, etc.) fixed. |
| Researcher Affiliation | Academia | Hyeonwoo Noh, Tackgeun You, Jonghwan Mun, Bohyung Han. Dept. of Computer Science and Engineering, POSTECH, Korea |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using a publicly available implementation of Wide ResNet for experiments ("https://github.com/szagoruyko/wide-residual-networks") and other third-party code, but does not release its own code or explicitly state that the code for the proposed method is open source. |
| Open Datasets | Yes | We evaluate the proposed training algorithm in various architectures for real world tasks including object recognition [40], visual question answering [39], image captioning [35] and action recognition [8]... evaluated on CIFAR datasets [19]... We use VQA dataset [2], which is commonly used for the evaluation of VQA algorithms... We use MSCOCO dataset for experiment... We employ a well-known benchmark of action classification, UCF-101 [33], for evaluation... |
| Dataset Splits | Yes | The dataset has three splits for cross validation, and the final performance is calculated by the average accuracy of the three splits. (UCF-101) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU models, CPU types, memory amounts) used for running experiments, only general statements about hyper-parameters being 'fixed'. |
| Software Dependencies | No | The paper mentions models and architectures like 'two-layer LSTM' or 'VGG-16' but does not list specific software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, specific CUDA versions). |
| Experiment Setup | Yes | We perform experiments using the wide residual network with widening factor 10 and depth 28... Wide ResNet (depth=28, dropout=0.3) [40]... Wide ResNet (depth=28, dropout=0.5)... with IWSGD (S = 4)... with IWSGD (S = 8)... When we evaluate performance of IWSGD with 5 and 8 samples... |
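The IWSGD variants quoted above draw S stochastic forward passes per example (each with an independent dropout mask) and optimize an importance-weighted lower bound on the log-likelihood. The sketch below is a minimal NumPy illustration of that multi-sample objective, not the authors' implementation; the toy linear model, keep probability, and label are assumptions chosen only to make the bound computable end to end.

```python
import numpy as np

rng = np.random.default_rng(0)


def iw_objective(log_probs):
    """Multi-sample importance-weighted lower bound:
    log((1/S) * sum_s exp(log p_s)), computed stably via logsumexp.
    `log_probs` holds S per-sample log-likelihoods along axis 0."""
    S = log_probs.shape[0]
    m = log_probs.max(axis=0)
    return m + np.log(np.exp(log_probs - m).sum(axis=0)) - np.log(S)


# Toy setting: S = 4 dropout samples of the log-likelihood of one input
# under a hypothetical linear classifier (dropout rate 0.3, as in the
# Wide ResNet configuration quoted from the paper).
S, keep = 4, 0.7
x = rng.normal(size=8)   # toy feature vector (assumption)
w = rng.normal(size=8)   # toy weights (assumption)

log_probs = []
for _ in range(S):
    mask = rng.random(8) < keep          # independent dropout mask
    logit = (x * mask / keep) @ w        # inverted-dropout forward pass
    log_probs.append(-np.log1p(np.exp(-logit)))  # log p(y=1 | x, mask)
log_probs = np.array(log_probs)

bound = iw_objective(log_probs)
# By Jensen's inequality, the multi-sample bound is at least the
# average single-sample objective (the S = 1 training criterion).
assert bound >= log_probs.mean() - 1e-12
```

Increasing S tightens this bound toward the true marginal log-likelihood, which matches the paper's observation that performance improves as the number of samples grows from 4 to 8.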