Adversarial Dropout for Recurrent Neural Networks
Authors: Sungrae Park, Kyungwoo Song, Mingi Ji, Wonsung Lee, Il-Chul Moon (pp. 4699–4706)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | According to our experiments, adversarial dropout for RNNs showed the advanced performances on the sequential versions of MNIST, the semi-supervised text classification, and the language modeling tasks. ... Table 1 shows the test performances of the dropout-based regularizations. |
| Researcher Affiliation | Collaboration | Sungrae Park,1 Kyungwoo Song,2 Mingi Ji,2 Wonsung Lee,3 Il-Chul Moon2 1Clova AI Research, NAVER Corp., Korea 2Industrial & Systems Engineering, KAIST, Korea 3AI Center, SK Telecom, Korea |
| Pseudocode | No | The paper describes algorithmic steps and refers to a 'detail algorithm in appendix' but does not present a clearly labeled pseudocode or algorithm block within the provided text. |
| Open Source Code | Yes | Our implementation code will be available at https://github.com/sungraepark/adversarial_dropout_text_classification. ... Our implementation code will be available at https://github.com/sungraepark/adversarial_dropout_lm. |
| Open Datasets | Yes | Sequential MNIST tasks, also known as pixel-by-pixel MNIST... IMDB is a standard benchmark movie review dataset... Elec is a dataset on electronic product reviews from Amazon... the Penn Treebank (PTB)... and Wiki Text-2 (WT2) dataset... |
| Dataset Splits | Yes | These hyperparameters of the baseline models as well as our models were retrieved in the validation phase. ... Table 3 shows the perplexity on both the PTB and Wiki Text-2 validation and test datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU model, GPU model, memory size) used to conduct the experiments. |
| Software Dependencies | No | The paper mentions implementing models but does not list specific software dependencies (e.g., libraries, frameworks) along with their version numbers. |
| Experiment Setup | Yes | For the settings for the dropout, we set the dropout probability as 0.1 for the baseline models. In the case of the adversarial dropout, we adapted ϵ0 = Eϵ[ϵ], which indicates the expectation of the dropout mask, and δ = 0.03, which represents the maximum changes from the base dropout mask as 3%. These hyperparameters of the baseline models as well as our models were retrieved in the validation phase. All models were trained with the same optimizer (detail settings in appendix). |
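The δ = 0.03 setting quoted above caps how far the adversarial dropout mask may deviate from the base mask (at most 3% of its elements may change). A minimal sketch of that budget constraint, assuming a NumPy 0/1 mask and a hypothetical `constrain_mask` helper; the paper's actual algorithm ranks which elements to flip using gradient information, whereas the ranking here is only a placeholder:

```python
import numpy as np

def constrain_mask(base_mask, candidate_mask, delta):
    """Return a mask that differs from base_mask in at most
    ceil(delta * d) positions, where d is the mask dimension.

    Illustrative sketch only: the real adversarial-dropout projection
    chooses WHICH elements to flip by their gradient impact; here we
    simply keep the first `budget` differing positions.
    """
    d = base_mask.size
    budget = int(np.ceil(delta * d))  # e.g. delta=0.03, d=100 -> 3 flips
    diff = np.flatnonzero(base_mask != candidate_mask)
    if diff.size <= budget:
        return candidate_mask.copy()
    # Start from the base mask and apply only `budget` of the changes.
    constrained = base_mask.copy()
    keep = diff[:budget]  # placeholder for a gradient-based ranking
    constrained[keep] = candidate_mask[keep]
    return constrained

# Example: a candidate mask flipping 10 of 100 units is projected
# back to within the 3% budget.
base = np.ones(100, dtype=int)
candidate = base.copy()
candidate[:10] = 0
projected = constrain_mask(base, candidate, delta=0.03)
```

With δ = 0.03 and a 100-unit layer, the projected mask above differs from the base mask in exactly 3 positions, matching the "maximum changes from the base dropout mask as 3%" constraint described in the setup.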