Reading selectively via Binary Input Gated Recurrent Unit

Authors: Zhe Li, Peisong Wang, Hanqing Lu, Jian Cheng

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct document classification task and language modeling task on 6 different datasets to verify our model and our model achieves better performance.
Researcher Affiliation | Academia | Zhe Li¹, Peisong Wang¹,², Hanqing Lu¹ and Jian Cheng¹,²; ¹National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; ²Center for Excellence in Brain Science and Intelligence Technology
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code.
Open Datasets | Yes | We conduct document classification task and language modeling task on 6 different datasets... including Stanford Sentiment Treebank (SST), IMDb, AGNews and DBPedia. ... Penn Treebank (PTB) and Wiki Text-2 dataset.
Dataset Splits | Yes | Table 1: Statistics of the classification datasets that BIGRU is evaluated on, where SST refers to Stanford Sentiment Treebank. Train / validation / test sizes: SST (Sentiment Analysis, Pos/Neg) 6,920 / 872 / 1,821; IMDb (Sentiment Analysis, Pos/Neg) 21,143 / 3,857 / 25,000; AGNews (News Classification, 4 categories) 101,851 / 18,149 / 7,600; DBPedia (Topic Classification, 14 categories) 475,999 / 84,000 / 69,999
Hardware Specification | No | The paper does not provide specific hardware details for running its experiments.
Software Dependencies | No | The paper does not specify software dependencies or version numbers.
Experiment Setup | Yes | For both GRU and BIGRU, we use a stacked three-layer RNN. Each word is embedded into a 100-dimensional vector. All models are trained with Adam, with the initial learning rate of 0.0001. We set gradient clip to 2.0. We use batch size of 32 for SST and 128 for the remaining. For both models, we set an early stop if the validation accuracy does not increase for 1000 global steps. ... We use an initial learning rate of 10 for all experiments and carry out gradient clipping with maximum norm 0.25. We use a batch size of 80 for Wiki Text-2 and 40 for PTB. We train 1000 epochs for the small model and 2000 epochs for the large model.
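
To make the reported classification hyperparameters concrete, here is a minimal PyTorch sketch that assembles them into a single training step: a stacked three-layer recurrent model, 100-dimensional word embeddings, Adam with an initial learning rate of 0.0001, and gradient clipping at 2.0. This is not the authors' code (none is released): the hidden size, vocabulary size, class count, and the reading of "gradient clip 2.0" as a max-norm clip are assumptions, and a plain GRU stands in for the BIGRU cell, whose binary input gating is not reproduced here.

```python
# Sketch of the classification training setup described in the Experiment Setup
# row. Placeholders/assumptions: vocab_size, num_classes, hidden_dim, and the
# interpretation of "gradient clip 2.0" as max-norm clipping. The BIGRU gating
# mechanism itself is NOT implemented; a standard GRU stack is used instead.
import torch
import torch.nn as nn

class StackedGRUClassifier(nn.Module):
    def __init__(self, vocab_size, num_classes, embed_dim=100, hidden_dim=100, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)      # 100-dim word embeddings
        self.rnn = nn.GRU(embed_dim, hidden_dim,
                          num_layers=num_layers, batch_first=True)  # stacked three-layer RNN
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, tokens):
        x = self.embed(tokens)          # (batch, seq_len, embed_dim)
        _, h_n = self.rnn(x)            # h_n: (num_layers, batch, hidden_dim)
        return self.fc(h_n[-1])         # classify from the top layer's final state

model = StackedGRUClassifier(vocab_size=30000, num_classes=2)   # placeholder sizes
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)       # "initial learning rate of 0.0001"
criterion = nn.CrossEntropyLoss()

def train_step(tokens, labels):
    optimizer.zero_grad()
    loss = criterion(model(tokens), labels)
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=2.0)  # "gradient clip to 2.0"
    optimizer.step()
    return loss.item()
```

The batch sizes (32 for SST, 128 for the other classification datasets) and the 1000-step early-stopping rule on validation accuracy would sit in the data loader and outer training loop, which are omitted here; the language-modeling runs use a different optimizer configuration (initial learning rate 10, clipping at norm 0.25) and are not covered by this sketch.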