Explicit Interaction Model towards Text Classification

Authors: Cunxiao Du, Zhaozheng Chen, Fuli Feng, Lei Zhu, Tian Gan, Liqiang Nie (pp. 6359-6366)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results demonstrate the superiority of the proposed method. We justified the proposed approach on several benchmark datasets including both multi-label and multi-class text classification tasks.
Researcher Affiliation | Academia | (1) Shandong University, No. 72 Binhai Road, Jimo, Qingdao, Shandong, China 266237; (2) National University of Singapore, 13 Computing Drive, Singapore 117417; (3) Shandong Normal University, No. 1 University Road, Changqing Dist., Ji'nan, Shandong, China 250358
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | As a byproduct, we have released the codes and parameter settings to facilitate other researchers. We release the implementation of our method (including some baselines) and the involved parameter settings to facilitate later researchers [1]. [1] https://github.com/NonvolatileMemory/AAAI_2019_EXAM
Open Datasets | Yes | We used publicly available benchmark datasets from (Zhang, Zhao, and LeCun 2015) to evaluate EXAM. There are in total 6 text classification datasets, corresponding to sentiment analysis, news classification, question-answer and ontology extraction tasks, respectively. ... We conducted experiments on two different multi-label text classification datasets, named KanShan-Cup dataset [2] (a benchmark) and Zhihu dataset [3], respectively. [2] https://biendata.com/competition/zhihu/ [3] www.zhihu.com
Dataset Splits | Yes | We split 10% samples from the training set as the validation set to perform early stop for our models. ... We separated the dataset into training, validation, and testing with 2,800,000, 20,000, and 180,000 questions, respectively. ... We adopted 3,000,000 samples as the training set, 30,000 samples as validation and 300,000 samples as testing.
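The 10% hold-out quoted above is straightforward to reproduce. Below is a minimal sketch, assuming the training samples are already loaded into a Python list; the function name, seed, and fraction argument are illustrative choices, not taken from the released code.

```python
import random

def split_for_early_stopping(train_samples, val_fraction=0.1, seed=42):
    """Hold out a fraction of the training set as a validation set
    (the paper reports using 10% of the training set for early stopping)."""
    samples = list(train_samples)
    random.Random(seed).shuffle(samples)
    n_val = int(len(samples) * val_fraction)
    return samples[n_val:], samples[:n_val]  # (train, validation)

# Example arithmetic: 120,000 training samples -> 108,000 train / 12,000 validation.
# The multi-label datasets instead use the fixed splits quoted in the row above.
```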
Hardware Specification | Yes | Our models are implemented and trained by MXNet (Chen et al. 2015) with a single NVIDIA TITAN Xp. ... We applied Adam (Kingma and Ba 2014) to optimize models on one NVIDIA TITAN Xp with the batch size of 1000 and the initial learning rate is 0.001.
Software Dependencies | No | Our models are implemented and trained by MXNet (Chen et al. 2015)... The paper mentions software like MXNet, Adam, and word2vec, but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | For the multi-class task, we chose region embedding as the Encoder in EXAM. The region size is 7 and embedding size is 128. We used Adam (Kingma and Ba 2014) as the optimizer with the initial learning rate 0.0001 and the batch size is set to 16. As for the aggregation MLP, we set the size of the hidden layer as 2 times interaction feature length. ... We used the matrix trained by word2vec (Mikolov et al. 2013) to initialize the embedding layer, and the embedding size is 256. We adopted GRU as the Encoder, and each GRU Cell has 1,024 hidden states. The accumulated MLP has 60 hidden units. We applied Adam (Kingma and Ba 2014) to optimize models on one NVIDIA TITAN Xp with the batch size of 1000 and the initial learning rate is 0.001.
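To make the quoted multi-label settings concrete, here is a minimal MXNet Gluon sketch of an EXAM-style model with the stated hyperparameters (256-dim embeddings, a GRU encoder with 1,024 hidden units, a 60-unit aggregation MLP, Adam with learning rate 0.001 and batch size 1,000). The layer layout, class/vocabulary sizes, and sequence length are assumptions for illustration, not the authors' released implementation; the multi-class variant instead uses a region embedding encoder (region size 7, embedding size 128, Adam with learning rate 0.0001, batch size 16).

```python
# Illustrative sketch only: hyperparameters follow the quoted settings; values marked
# "assumed" and the exact layer layout are NOT taken from the authors' released code.
import mxnet as mx
from mxnet import gluon, nd
from mxnet.gluon import nn, rnn

VOCAB_SIZE = 100000   # assumed; the paper initialises embeddings from word2vec
EMBED_SIZE = 256      # embedding size quoted above
GRU_HIDDEN = 1024     # hidden states per GRU cell quoted above
MLP_HIDDEN = 60       # aggregation MLP hidden units quoted above
NUM_CLASSES = 1999    # assumed number of labels
SEQ_LEN = 50          # assumed fixed (padded) question length

class ExamSketch(gluon.Block):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.embedding = nn.Embedding(VOCAB_SIZE, EMBED_SIZE)
        self.encoder = rnn.GRU(GRU_HIDDEN, layout='NTC')            # word-level encoder
        # Interaction layer: each row of this weight matrix acts as a class representation,
        # so the output holds the dot product between every class and every encoded word.
        self.interaction = nn.Dense(NUM_CLASSES, use_bias=False, flatten=False)
        self.aggregation = nn.Sequential()
        self.aggregation.add(nn.Dense(MLP_HIDDEN, activation='relu', flatten=False),
                             nn.Dense(1, flatten=False))            # per-class score

    def forward(self, tokens):                                      # tokens: (batch, SEQ_LEN)
        h = self.encoder(self.embedding(tokens))                    # (batch, SEQ_LEN, GRU_HIDDEN)
        inter = self.interaction(h)                                 # (batch, SEQ_LEN, NUM_CLASSES)
        inter = inter.transpose((0, 2, 1))                          # (batch, NUM_CLASSES, SEQ_LEN)
        return self.aggregation(inter).squeeze(axis=2)              # (batch, NUM_CLASSES) logits

net = ExamSketch()
net.initialize()
trainer = gluon.Trainer(net.collect_params(), 'adam', {'learning_rate': 0.001})
logits = net(nd.zeros((1000, SEQ_LEN)))                             # batch size 1000 as quoted
```

The sketch keeps the two pieces the quoted setup describes: an encoder that produces per-word states, and an aggregation MLP that turns each class's word-interaction vector into a score. A faithful reproduction should follow the released repository rather than this outline.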