Leap-LSTM: Enhancing Long Short-Term Memory for Text Categorization
Authors: Ting Huang, Gehui Shen, Zhi-Hong Deng
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate Leap-LSTM on several text categorization tasks: sentiment analysis, news categorization, ontology classification and topic classification, with five benchmark data sets. The experimental results show that our model reads faster and predicts better than standard LSTM. |
| Researcher Affiliation | Academia | Key Laboratory of Machine Perception (Ministry of Education), School of Electronics Engineering and Computer Science, Peking University {ht1221, jueliangguke, zhdeng}@pku.edu.cn |
| Pseudocode | No | The paper describes the model architecture and equations but does not provide any pseudocode or a clearly labeled algorithm block. |
| Open Source Code | Yes | We provide a github link https://github.com/AnonymizedUser/appendix-for-leap-LSTM. |
| Open Datasets | Yes | We use five freely available large-scale data sets introduced by [Zhang et al., 2015], which cover several classification tasks (see Table 1). |
| Dataset Splits | Yes | For each data set, we randomly select 10% of the training set as the development set for hyperparameter selection and early stopping. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments, such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions using GloVe embeddings and the Adam optimizer but does not specify versions for its software stack (e.g., Python, PyTorch, or TensorFlow). |
| Experiment Setup | Yes | Dimensions {h, d, p, f, s, h′} are set to {300, 300, 300, 200, 20, 20} respectively. The sizes of CNN filters are {[3, 300, 1, 60], [4, 300, 1, 60], [5, 300, 1, 60]}. The temperature τ is always 0.1. For λ and r_t, the hyperparameters of the penalty term, different settings are applied depending on the desired skip rate. Throughout our experiments, we use a size of 32 for minibatches. We use Adam [Kingma and Ba, 2014] to optimize all trainable parameters with an initial learning rate of 0.001. |
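
The reported experiment setup can be collected into a single configuration object. Below is a minimal sketch, assuming PyTorch (the paper does not name its framework); the names `LeapLSTMConfig` and `build_optimizer` are illustrative placeholders, not the authors' released code, and the λ / r_t penalty settings are left out because the paper varies them with the desired skip rate.

```python
# Hedged sketch of the reported hyperparameters; framework choice (PyTorch) is an assumption.
from dataclasses import dataclass

import torch
import torch.nn as nn


@dataclass
class LeapLSTMConfig:
    # Dimensions {h, d, p, f, s, h'} = {300, 300, 300, 200, 20, 20} as reported.
    hidden_dim: int = 300        # h: LSTM hidden size
    embed_dim: int = 300         # d: GloVe word-embedding size
    preview_dim: int = 300       # p
    f_dim: int = 200             # f
    s_dim: int = 20              # s
    h_small_dim: int = 20        # h'
    # CNN filter shapes [window, embed_dim, in_channels, out_channels].
    cnn_filters: tuple = ((3, 300, 1, 60), (4, 300, 1, 60), (5, 300, 1, 60))
    gumbel_temperature: float = 0.1   # τ for the skip/keep decision
    batch_size: int = 32
    learning_rate: float = 1e-3       # initial Adam learning rate
    # λ and r_t (skip-rate penalty) are set per desired skip rate and omitted here.


def build_optimizer(model: nn.Module, cfg: LeapLSTMConfig) -> torch.optim.Adam:
    """Adam over all trainable parameters with the reported initial learning rate."""
    return torch.optim.Adam(model.parameters(), lr=cfg.learning_rate)
```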