Leap-LSTM: Enhancing Long Short-Term Memory for Text Categorization

Authors: Ting Huang, Gehui Shen, Zhi-Hong Deng

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate Leap-LSTM on several text categorization tasks: sentiment analysis, news categorization, ontology classification and topic classification, with five benchmark data sets. The experimental results show that our model reads faster and predicts better than standard LSTM. |
| Researcher Affiliation | Academia | Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University {ht1221, jueliangguke, zhdeng}@pku.edu.cn |
| Pseudocode | No | The paper describes the model architecture and equations but does not provide any pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | We provide a github link https://github.com/AnonymizedUser/appendixfor-leap-LSTM. |
| Open Datasets | Yes | We use five freely available large-scale data sets introduced by [Zhang et al., 2015], which cover several classification tasks (see Table 1). |
| Dataset Splits | Yes | For each data set, we randomly select 10% of the training set as the development set for hyperparameter selection and early stopping. (A minimal split sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using GloVe embeddings and the Adam optimizer but does not specify software versions for libraries or frameworks such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | Dimensions {h, d, p, f, s, h} are set to {300, 300, 300, 200, 20, 20} respectively. The sizes of the CNN filters are {[3, 300, 1, 60], [4, 300, 1, 60], [5, 300, 1, 60]}. The temperature τ is always 0.1. For λ and r_t, the hyperparameters of the penalty term, different settings are applied depending on our desired skip rate. Throughout our experiments, we use a minibatch size of 32. We use Adam [Kingma and Ba, 2014] to optimize all trainable parameters with an initial learning rate of 0.001. (A hedged configuration sketch follows the table.) |
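The Dataset Splits row quotes only that 10% of each training set is held out as a development set for hyperparameter selection and early stopping. Below is a minimal sketch of such a split, assuming the data is available as a list of (text, label) pairs; the function name, seed, and toy data are illustrative and not taken from the paper.

```python
import random

def split_train_dev(train_examples, dev_fraction=0.1, seed=0):
    """Randomly hold out `dev_fraction` of the training examples as a development set."""
    examples = list(train_examples)
    random.Random(seed).shuffle(examples)          # shuffle a copy, reproducibly
    dev_size = int(len(examples) * dev_fraction)   # 10% by default, as in the paper
    return examples[dev_size:], examples[:dev_size]  # (train, dev)

# Example with a toy corpus of 10 documents -> 9 train / 1 dev
toy = [(f"document {i}", i % 2) for i in range(10)]
train, dev = split_train_dev(toy)
```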
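The Experiment Setup row reports the dimensions, filter shapes, temperature, batch size, and optimizer, but the paper names no framework. The sketch below shows one way those reported numbers could be wired up in PyTorch; the framework choice and the stand-in `nn.LSTM` encoder are assumptions (the authors' Leap-LSTM model is not reproduced here), and only the numeric values come from the quoted setup.

```python
import torch
import torch.nn as nn

# Values quoted in the Experiment Setup row above.
EMBED_DIM = 300          # word embedding dimension d
HIDDEN_DIM = 300         # LSTM hidden dimension h
CNN_FILTERS = [(3, 300, 1, 60), (4, 300, 1, 60), (5, 300, 1, 60)]  # reported filter sizes
TAU = 0.1                # temperature used for the skip decision
BATCH_SIZE = 32
LEARNING_RATE = 1e-3     # initial learning rate for Adam

# Stand-in encoder (NOT the authors' Leap-LSTM): a plain LSTM is used only to
# show how the reported optimizer and batch settings would be applied.
encoder = nn.LSTM(input_size=EMBED_DIM, hidden_size=HIDDEN_DIM, batch_first=True)
optimizer = torch.optim.Adam(encoder.parameters(), lr=LEARNING_RATE)

# One dummy forward pass with the reported batch size (sequence length is arbitrary).
dummy_batch = torch.randn(BATCH_SIZE, 50, EMBED_DIM)
outputs, _ = encoder(dummy_batch)
```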