Hierarchical Inter-Attention Network for Document Classification with Multi-Task Learning
Authors: Bing Tian, Yong Zhang, Jin Wang, Chunxiao Xing
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on 15 public datasets demonstrate the benefits of our proposed model. |
| Researcher Affiliation | Academia | 1RIIT, TNList, Dept. of Computer Science and Technology, Tsinghua University, Beijing, China. 2Computer Science Department, University of California, Los Angeles |
| Pseudocode | No | The paper describes its methods using text and mathematical equations and provides a network architecture diagram (Figure 2, Figure 3, Figure 4), but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement about open-sourcing the code for the methodology or provide a link to a code repository. |
| Open Datasets | Yes | The first 14 datasets are Amazon product reviews coming from different domains such as Books, Music, Baby, etc. These datasets are collected based on the dataset¹ provided by Blitzer et al. [2007]. The last IMDB dataset contains movie reviews with binary classes [Maas et al., 2011]. ¹https://www.cs.jhu.edu/~mdredze/datasets/sentiment/ |
| Dataset Splits | Yes | Following previous studies, we randomly split these datasets into training sets, development sets and testing sets with the proportion of 70%, 10% and 20% respectively. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not explicitly describe the hardware specifications (e.g., specific GPU or CPU models, memory details) used for running the experiments. |
| Software Dependencies | No | The paper mentions using GloVe vectors and the Adam optimizer, but it does not specify any software dependencies with version numbers (e.g., Python, TensorFlow/PyTorch versions, or specific library versions). |
| Experiment Setup | Yes | The detailed settings of hyper-parameters are shown in Table 2: word embedding size d = 200, size of word-level basic-LSTM layer h_w = 50, ..., initial learning rate 0.001, regularization 1e-5. (A configuration sketch follows the table.) |
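
The 70%/10%/20% split reported in the Dataset Splits row can be reproduced along the following lines. This is a minimal sketch: the function name and the random seed are illustrative assumptions, since the paper only states the proportions.

```python
import random

def split_dataset(examples, train_frac=0.7, dev_frac=0.1, seed=42):
    """Randomly split examples into train/dev/test portions.

    The 70/10/20 proportions follow the paper; the seed is an
    illustrative assumption (the paper does not report one).
    """
    examples = list(examples)
    random.Random(seed).shuffle(examples)
    n = len(examples)
    n_train = int(n * train_frac)
    n_dev = int(n * dev_frac)
    train = examples[:n_train]
    dev = examples[n_train:n_train + n_dev]
    test = examples[n_train + n_dev:]
    return train, dev, test

# Example: split 1,000 labelled reviews into 700/100/200.
reviews = [(f"review {i}", i % 2) for i in range(1000)]
train_set, dev_set, test_set = split_dataset(reviews)
print(len(train_set), len(dev_set), len(test_set))  # 700 100 200
```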
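
The Experiment Setup row lists embedding size 200, word-level LSTM hidden size 50, Adam with initial learning rate 0.001, and regularization 1e-5. The sketch below wires those reported values into a word-level BiLSTM encoder; it assumes PyTorch (the paper names no framework), uses a placeholder vocabulary size, and is not the authors' full hierarchical inter-attention model.

```python
import torch
import torch.nn as nn

# Hyper-parameters reported in Table 2 of the paper; values not shown
# there (e.g. vocabulary size) are placeholders.
EMBED_DIM = 200        # word embedding size d
WORD_LSTM_HIDDEN = 50  # size of word-level basic-LSTM layer h_w
LEARNING_RATE = 1e-3   # initial learning rate
WEIGHT_DECAY = 1e-5    # stands in for the paper's L2 regularization

class WordLevelEncoder(nn.Module):
    """Illustrative word-level BiLSTM encoder, not the authors' full model."""

    def __init__(self, vocab_size=30000):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, WORD_LSTM_HIDDEN,
                            batch_first=True, bidirectional=True)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)          # (batch, seq, EMBED_DIM)
        outputs, _ = self.lstm(embedded)              # (batch, seq, 2 * hidden)
        return outputs

model = WordLevelEncoder()
# Adam with the reported learning rate; weight_decay approximates the
# paper's regularization term.
optimizer = torch.optim.Adam(model.parameters(),
                             lr=LEARNING_RATE,
                             weight_decay=WEIGHT_DECAY)
```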