Adversarial Active Learning for Sequences Labeling and Generation
Authors: Yue Deng, Ka Wai Chen, Yilin Shen, Hongxia Jin
IJCAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We verify our sequence-based active learning approach on two tasks including sequence labeling and sequence generation. In this part, we investigate the performances of ALISE on two sequence learning tasks including slot filling and image captioning. |
| Researcher Affiliation | Collaboration | Yue Deng1, Ka Wai Chen2, Yilin Shen1, Hongxia Jin1 1 AI Center, Samsung Research America, Mountain View, CA, USA 2 Department of Electrical and Computer Engineering, University of California, San Diego |
| Pseudocode | Yes | Algorithm 1: ALISE Learning |
| Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology. |
| Open Datasets | Yes | This part of experiments were mainly conducted on the ATIS (Airline Travel Information Systems) dataset [Hemphill et al., 1990]. This part of active learning experiments are mainly conducted on MSCOCO dataset [Lin et al., 2014]. |
| Dataset Splits | Yes | Among all labeled training samples, we further randomly select 10% of them as validation samples. MSCOCO dataset [Lin et al., 2014], which consists of 82,783 images for training, 40,504 for validation, and 40,775 for testing. |
| Hardware Specification | Yes | The ALISE training in slot filling task (with 2700 samples) can be accomplished in just 74 seconds with 16 GPUs (Tesla K80) parallelized in optimization. |
| Software Dependencies | No | The paper mentions software components like "bi-directional LSTM", "standard LSTM decoder", "attention model", "ADAM", and "relu activation", but it does not specify any version numbers for these or underlying libraries/frameworks. |
| Experiment Setup | Yes | We choose 128 for word embedding layer and 64 hidden states for the encoder LSTM... The adversarial network D is configured by three dense-connected layers with 128 (input layer), 64 (intermediate layer) and 1 (output layer) units, respectively. The output layer is further connected with a sigmoid function for probabilistic conversion. We use relu activation among all other layers. Each token of the output sequence is coded as a one-hot vector with the hot entry indicating the underlying category of the token. The whole deep learning system was trained by ADAM [Kingma and Ba, 2014]. |
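
The "Experiment Setup" row above quotes enough detail to reconstruct the rough shape of the encoder and the adversarial network D. The sketch below is a minimal PyTorch rendering of that quoted configuration, not the authors' released implementation: the 128-dim embeddings, 64-unit (per direction) bi-directional LSTM, the 128-64-1 dense discriminator with ReLU activations and a sigmoid output, and ADAM training come from the paper's text, while the vocabulary size, mean-pooling of encoder states, and the assumption that D's 128-unit input layer consumes the 2×64-dim bi-LSTM state are illustrative guesses.

```python
# Minimal sketch of the quoted ALISE experiment setup (assumptions noted inline).
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Bi-directional LSTM encoder over 128-dim word embeddings (per the paper's setup)."""

    def __init__(self, vocab_size, emb_dim=128, hidden_dim=64):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                            bidirectional=True)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        emb = self.embedding(tokens)
        outputs, _ = self.lstm(emb)      # (batch, seq_len, 2 * hidden_dim) = (..., 128)
        return outputs.mean(dim=1)       # pooled sequence representation (assumption)


class Discriminator(nn.Module):
    """Adversarial network D: three dense layers (128, 64, 1 units), ReLU inside, sigmoid out."""

    def __init__(self, in_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),   # "input layer", 128 units
            nn.Linear(128, 64), nn.ReLU(),       # intermediate layer, 64 units
            nn.Linear(64, 1), nn.Sigmoid(),      # output unit with probabilistic conversion
        )

    def forward(self, h):
        return self.net(h)                       # (batch, 1) score in [0, 1]


# Example wiring (hypothetical vocabulary size). The whole system is trained with ADAM,
# matching the optimizer named in the setup description.
encoder = Encoder(vocab_size=10000)
D = Discriminator(in_dim=128)
optimizer = torch.optim.Adam(list(encoder.parameters()) + list(D.parameters()))
```

One detail that makes this reading plausible: a bi-directional LSTM with 64 hidden states per direction yields 128-dim sequence states, which lines up with the 128-unit input layer reported for D, though the paper excerpt does not state this pairing explicitly.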