Jumper: Learning When to Make Classification Decisions in Reading

Authors: Xianggen Liu, Lili Mou, Haotian Cui, Zhengdong Lu, Sen Song

Venue: IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluated JUMPER on three tasks, including two benchmark datasets and one real industrial application. We show the performance of both the ultimate classification and the jumping positions, and we also provide an in-depth analysis of our model.
Researcher Affiliation | Collaboration | Department of Biomedical Engineering, IDG/McGovern Institute for Brain Research, Tsinghua University; AdeptMind.ai; DeeplyCurious.ai; Laboratory of Brain and Intelligence, Tsinghua University
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Both code and the Occupational Injury dataset are available at: https://github.com/jumper-data
Open Datasets | Yes | Movie Review (MR), whose objective is binary sentiment classification (positive vs. negative) of movie reviews [Pang and Lee, 2004]; it is widely used as a sentence classification task. AG news corpus (AG), a collection of more than one million news articles, for which we followed Zhang et al. [2015]... Occupational Injury (OI). The task of extracting occupational-injury information originates from a real industrial application in the legal domain... Both code and the Occupational Injury dataset are available at: https://github.com/jumper-data
Dataset Splits | Yes | We did not perform any dataset-specific tuning except early stopping on the development sets. For AG, which does not have a standard split, we randomly selected 5% of the training data as the development set (a minimal split sketch appears after the table).
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types and speeds, memory amounts, or other machine specifications) used to run its experiments.
Software Dependencies | No | The paper mentions several components, such as CNN, GRU, GloVe vectors, and AdaDelta, but does not provide version numbers for any software libraries, frameworks, or programming languages used.
Experiment Setup | Yes | In our model and baselines, the CNN part used rectified linear units (ReLU) as the activation function, filter windows with sizes 1 to 5, 200 feature maps for each filter, and a dropout rate of 0.5; the GRU had a hidden size of 20. We reimplemented the self-attentive model using the same hyperparameters as in Lin et al. [2017]. For reinforcement learning, the intermediate reward r was 0.05, the discount rate γ was 0.9, and the exploration rate ϵ was 0.1. In addition, word embeddings for all of the models were initialized with 300d GloVe vectors [Pennington et al., 2014] and fine-tuned during training to improve performance. The other parameters were initialized by sampling uniformly from [-0.01, 0.01]. For all the models, we used AdaDelta with a learning rate of 0.1 and a batch size of 50 (a configuration sketch follows the table).
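
The AG development split described in the Dataset Splits row is simple enough to reproduce directly. Below is a minimal Python sketch of such a random 5% hold-out; the function and variable names are our own illustration, not code from the paper:

```python
import random

def split_train_dev(train_data, dev_frac=0.05, seed=0):
    """Hold out a random fraction of the training set as a development set.

    Mirrors the paper's procedure for AG, which has no standard split:
    5% of the training data is randomly selected for development.
    (Function and variable names are ours, not the paper's.)
    """
    rng = random.Random(seed)
    indices = list(range(len(train_data)))
    rng.shuffle(indices)
    n_dev = int(len(indices) * dev_frac)
    dev = [train_data[i] for i in indices[:n_dev]]
    train = [train_data[i] for i in indices[n_dev:]]
    return train, dev
```

Fixing the seed makes the split reproducible across runs, which matters when early stopping on the development set is the only dataset-specific tuning.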
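
The Experiment Setup row reads as a complete hyperparameter configuration, so a sketch may help readers re-implement it. The following is a minimal PyTorch rendering of the reported CNN/GRU settings, initialization, and optimizer; the module structure and all names are our assumptions, since the paper does not publish this code:

```python
import torch
import torch.nn as nn

EMBED_DIM = 300                 # 300d GloVe vectors, fine-tuned during training
FILTER_SIZES = [1, 2, 3, 4, 5]  # filter windows with sizes 1 to 5
FEATURE_MAPS = 200              # 200 feature maps per filter size
DROPOUT = 0.5
GRU_HIDDEN = 20

# Reinforcement-learning hyperparameters reported in the paper
# (the jumping policy that consumes them is omitted from this sketch):
INTERMEDIATE_REWARD = 0.05
DISCOUNT_GAMMA = 0.9
EXPLORATION_EPS = 0.1

class CnnSentenceEncoder(nn.Module):
    """CNN sentence encoder with ReLU activations, per the reported setup."""
    def __init__(self, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMBED_DIM)  # init from GloVe in practice
        self.convs = nn.ModuleList(
            nn.Conv1d(EMBED_DIM, FEATURE_MAPS, k) for k in FILTER_SIZES
        )
        self.dropout = nn.Dropout(DROPOUT)

    def forward(self, tokens):                  # tokens: (batch, seq_len)
        x = self.embed(tokens).transpose(1, 2)  # (batch, EMBED_DIM, seq_len)
        # Convolve, apply ReLU, then max-pool over time for each filter size.
        feats = [torch.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.dropout(torch.cat(feats, dim=1))  # (batch, 5 * 200)

encoder = CnnSentenceEncoder(vocab_size=20_000)  # vocab size is illustrative
controller = nn.GRU(input_size=len(FILTER_SIZES) * FEATURE_MAPS,
                    hidden_size=GRU_HIDDEN, batch_first=True)

# Non-embedding parameters are drawn uniformly from [-0.01, 0.01].
for p in controller.parameters():
    nn.init.uniform_(p, -0.01, 0.01)

# AdaDelta with learning rate 0.1; the batch size of 50 is set at the
# data-loader level rather than here.
optimizer = torch.optim.Adadelta(
    list(encoder.parameters()) + list(controller.parameters()), lr=0.1
)
```

This shows only the encoder and optimizer settings quoted in the row; how the GRU controller and the RL constants combine into the full JUMPER training loop is described in the paper itself.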