Hierarchical Inter-Attention Network for Document Classification with Multi-Task Learning

Authors: Bing Tian, Yong Zhang, Jin Wang, Chunxiao Xing

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on 15 public datasets demonstrate the benefits of our proposed model.
Researcher Affiliation | Academia | (1) RIIT, TNList, Dept. of Computer Science and Technology, Tsinghua University, Beijing, China; (2) Computer Science Department, University of California, Los Angeles.
Pseudocode | No | The paper describes its methods using text and mathematical equations and provides network architecture diagrams (Figures 2, 3, and 4), but it does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include any explicit statement about open-sourcing the code for the methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | The first 14 datasets are Amazon product reviews from different domains such as Books, Music, and Baby. These datasets are collected based on the dataset provided by Blitzer et al. [2007]. The last dataset, IMDB, contains movie reviews with binary classes [Maas et al., 2011]. Dataset source: https://www.cs.jhu.edu/~mdredze/datasets/sentiment/
Dataset Splits | Yes | Following previous studies, we randomly split these datasets into training sets, development sets and testing sets with the proportion of 70%, 10% and 20% respectively. (A minimal split sketch is given after this table.)
Hardware Specification | No | The paper does not explicitly describe the hardware specifications (e.g., specific GPU or CPU models, memory details) used for running the experiments.
Software Dependencies | No | The paper mentions using GloVe vectors and the Adam optimizer, but it does not specify any software dependencies with version numbers (e.g., Python, TensorFlow/PyTorch, or specific library versions).
Experiment Setup | Yes | The detailed settings of hyper-parameters are shown in Table 2: word embedding size d = 200; size of word-level basic-LSTM layer h_w = 50; ...; initial learning rate 0.001; regularization 1e-5. (A configuration sketch using these values is given after this table.)
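
The 70%/10%/20% split reported above can be illustrated with a minimal sketch, assuming each dataset is held in memory as a list of (document, label) pairs; the function name, argument names, and fixed seed below are illustrative assumptions, not details taken from the paper.

```python
import random

def split_dataset(examples, train_frac=0.7, dev_frac=0.1, seed=42):
    """Randomly split examples into train/dev/test portions (70%/10%/20%)."""
    examples = list(examples)
    random.Random(seed).shuffle(examples)           # shuffle before splitting
    n_train = int(len(examples) * train_frac)
    n_dev = int(len(examples) * dev_frac)
    train = examples[:n_train]
    dev = examples[n_train:n_train + n_dev]
    test = examples[n_train + n_dev:]               # remaining ~20%
    return train, dev, test

# Usage: train, dev, test = split_dataset(list_of_labelled_documents)
```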
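The hyper-parameters quoted from Table 2 can likewise be sketched as a configuration, assuming a PyTorch word-level BiLSTM encoder trained with the Adam optimizer mentioned in the paper; the class name, vocabulary size, and overall structure here are hypothetical and do not reproduce the authors' full architecture.

```python
import torch
import torch.nn as nn

EMBED_DIM = 200       # word embedding size d = 200 (Table 2)
WORD_LSTM_DIM = 50    # word-level basic-LSTM hidden size h_w = 50 (Table 2)
LEARNING_RATE = 1e-3  # initial learning rate 0.001 (Table 2)
WEIGHT_DECAY = 1e-5   # L2-style regularization 1e-5 (Table 2)

class WordEncoder(nn.Module):
    """Illustrative word-level BiLSTM encoder; not the authors' model."""
    def __init__(self, vocab_size):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, EMBED_DIM)
        self.lstm = nn.LSTM(EMBED_DIM, WORD_LSTM_DIM,
                            batch_first=True, bidirectional=True)

    def forward(self, token_ids):
        # Returns per-token hidden states of size 2 * WORD_LSTM_DIM.
        return self.lstm(self.embed(token_ids))[0]

model = WordEncoder(vocab_size=50_000)  # vocab_size is an assumed value
optimizer = torch.optim.Adam(model.parameters(),
                             lr=LEARNING_RATE,
                             weight_decay=WEIGHT_DECAY)
```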