Adversarial Training Methods for Semi-Supervised Text Classification
Authors: Takeru Miyato, Andrew M. Dai, Ian Goodfellow
ICLR 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed method achieves state of the art results on multiple benchmark semi-supervised and purely supervised tasks. We provide visualizations and analysis showing that the learned word embeddings have improved in quality and that while training, the model is less prone to overfitting. |
| Researcher Affiliation | Collaboration | (1) Preferred Networks, Inc., ATR Cognitive Mechanisms Laboratories, Kyoto University; (2) Google Brain; (3) OpenAI |
| Pseudocode | No | The paper describes the methods textually and mathematically but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code will be available at https://github.com/tensorflow/models/tree/master/adversarial_text. |
| Open Datasets | Yes | IMDB (Maas et al., 2011) is a standard benchmark movie review dataset for sentiment classification. Elec (Johnson & Zhang, 2015b) is an Amazon electronic product review dataset. Rotten Tomatoes (Pang & Lee, 2005) consists of short snippets of movie reviews, for sentiment classification. DBpedia (Lehmann et al., 2015; Zhang et al., 2015) is a dataset of Wikipedia pages for category classification. RCV1 (Lewis et al., 2004) consists of news articles from the Reuters Corpus. |
| Dataset Splits | Yes | For each dataset, we divided the original training set into a training set and a validation set, and we roughly optimized the hyperparameters shared across all methods (model architecture, batch size, training steps) using the validation performance of the base model with embedding dropout. |
| Hardware Specification | No | The paper states "All experiments used TensorFlow (Abadi et al., 2016) on GPUs" but does not specify any particular GPU model, CPU, or other detailed hardware specifications. |
| Software Dependencies | No | The paper mentions "All experiments used TensorFlow (Abadi et al., 2016)" but does not provide specific version numbers for TensorFlow or any other software libraries. |
| Experiment Setup | Yes | We used a unidirectional single-layer LSTM with 1024 hidden units. The word embedding dimension D was 256 on IMDB and 512 on the other datasets. For the optimization, we used the Adam optimizer (Kingma & Ba, 2015), with batch size 256, an initial learning rate of 0.001, and a 0.9999 learning rate exponential decay factor at each training step. We trained for 100,000 steps. We applied gradient clipping with norm set to 1.0 on all the parameters except word embeddings. For regularization of the recurrent language model, we applied dropout (Srivastava et al., 2014) on the word embedding layer with 0.5 dropout rate. |
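
Since the paper provides no pseudocode (see the Pseudocode row above), the following is a minimal, hedged sketch of the adversarial-perturbation step it describes in prose and equations: the word embeddings of a labelled example are perturbed along the gradient of the classification loss, rescaled to an L2 norm of epsilon, and the loss on the perturbed input serves as an additional regularization term. The function and variable names here are illustrative, `model` is assumed to map an embedded token sequence to class logits, and `epsilon` is a tuned hyperparameter whose value is not quoted in this report; this is a reconstruction, not the authors' released code.

```python
import tensorflow as tf

def adversarial_loss(model, embeddings, labels, epsilon=1.0):
    """Sketch of the adversarial loss: classification loss on perturbed embeddings.

    `embeddings` is the embedded token sequence [batch, time, dim];
    `epsilon` bounds the L2 norm of the perturbation (tuned hyperparameter).
    """
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    with tf.GradientTape() as tape:
        tape.watch(embeddings)
        clean_loss = loss_fn(labels, model(embeddings))
    # Treat the gradient as a constant when building the adversarial example.
    grad = tf.stop_gradient(tape.gradient(clean_loss, embeddings))
    # Worst-case direction, L2-normalized per example to radius epsilon.
    norm = tf.norm(grad, axis=[1, 2], keepdims=True) + 1e-12
    r_adv = epsilon * grad / norm
    return loss_fn(labels, model(embeddings + r_adv))
```

The full training objective would combine this term with the ordinary supervised loss (and, in the semi-supervised variant, a virtual adversarial loss computed on unlabelled data); those pieces are omitted from this sketch.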
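
The Experiment Setup row translates fairly directly into a model and optimizer configuration. Below is a minimal sketch using the modern tf.keras API (the paper used a 2016-era TensorFlow release); the vocabulary size and number of classes are placeholders not quoted above, and excluding word-embedding gradients from norm clipping, as the paper does, would require a custom training step rather than the blanket `clipnorm` used here.

```python
import tensorflow as tf

VOCAB_SIZE = 30000   # placeholder; vocabulary size is not quoted in the row above
NUM_CLASSES = 2      # placeholder; e.g. binary sentiment on IMDB
EMBED_DIM = 256      # 256 on IMDB, 512 on the other datasets
HIDDEN_UNITS = 1024  # unidirectional single-layer LSTM

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
    tf.keras.layers.Dropout(0.5),            # 0.5 dropout on the word embedding layer
    tf.keras.layers.LSTM(HIDDEN_UNITS),
    tf.keras.layers.Dense(NUM_CLASSES),
])

# Adam, initial learning rate 0.001, decayed by a factor of 0.9999 at each step;
# training runs for 100,000 steps with batch size 256.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001, decay_steps=1, decay_rate=0.9999)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule, clipnorm=1.0)

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```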