Knowledge Enhanced Hybrid Neural Network for Text Matching
Authors: Yu Wu, Wei Wu, Can Xu, Zhoujun Li
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Evaluation results from extensive experiments on public data sets of question answering and conversation show that KEHNN can significantly outperform state-of-the-art matching models and particularly improve matching accuracy on pairs with long text. |
| Researcher Affiliation | Collaboration | Yu Wu, Wei Wu, Can Xu, Zhoujun Li. State Key Lab of Software Development Environment, Beihang University, Beijing, China; Microsoft Research, Beijing, China. Authors are supported by the Adept Mind Scholarship. {wuyu,lizj}@buaa.edu.cn; {wuwei,can.xu}@microsoft.com |
| Pseudocode | No | The paper describes its methods using mathematical formulations and textual descriptions but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We shared the code of our model at https://github.com/MarkWuNLP/KEHNN. |
| Open Datasets | Yes | We used a public data set in SemEval 2015 (Alessandro Moschitti, Glass, and Randeree 2015), which collects question-answer pairs from Qatar Living Forum and requires classifying the answers into 3 categories (i.e., C = 3 in our model): good, potential, and bad. And: We used a public English conversation data set, the Ubuntu Corpus (Lowe et al. 2015), to conduct the experiment. |
| Dataset Splits | Yes | The training set contains 1 million message-response pairs with a ratio of 1:1 between positive and negative responses, and both the validation set and the test set have 0.5 million message-response pairs with a ratio of 1:9 between positive and negative responses. And Table 2 (statistics of the QA data set; #questions, #answers, #answers per question): Training: 2,600, 16,541, 6.36; Dev: 300, 1,645, 5.48; Test: 329, 1,976, 6.00. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states 'We implemented all baselines and KEHNN by Theano (Theano Development Team 2016)' but does not provide specific version numbers for Theano or any other software dependencies. |
| Experiment Setup | Yes | For all models, we set the dimensionality of word embedding (i.e., d) as 100 and the maximum text length (i.e., I and J) as 200. In LSTM based models and the BiGRU in our model, we set the dimensionality of hidden states as 100 (i.e., m). We only used one convolution layer and one max-pooling layer in all CNN based models, because we found that the performance of the models did not improve as the number of layers increased. For Arc2, MatchPyramid, MV-LSTM, and KEHNN, we tuned the window size in convolution and pooling in {(2, 2), (3, 3), (4, 4)} and chose (3, 3) finally. The number of feature maps is 8. For Arc1 and CNTN, we selected the window size from {2, 3, 4} and set it as 3 finally. The number of feature maps is 200. In MLP, we tuned the dimensionality of the hidden layer in {50, 200, 400, 800} and set it as 50 finally. We implemented MultiGranCNN and Add Feature following the settings in the existing literature. Sx and Sy in KEHNN shared word embeddings, knowledge embeddings, parameters of BiGRUs, and parameters of the knowledge gates. All tuning was conducted on validation sets. The activation functions in baselines are the same as those in our model. As regularization, we employ early stopping (Lawrence and Giles 2000) and dropout (Srivastava et al. 2014) with a rate of 0.5. We set the initial learning rate and the batch size as 0.01 and 50, respectively. |
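The split statistics quoted above can be collected into a small sanity-check script. This is an illustrative sketch only, not code from the authors' repository; the names `UBUNTU_SPLITS` and `QA_SPLITS` are our own. It confirms that the per-split "answers per question" figures in Table 2 follow from the raw question and answer counts:

```python
# Illustrative encoding of the reported dataset statistics (our own
# names, not from the KEHNN repository).
UBUNTU_SPLITS = {
    # split: (message-response pairs, positive:negative ratio)
    "train": (1_000_000, (1, 1)),
    "valid": (500_000, (1, 9)),
    "test": (500_000, (1, 9)),
}

QA_SPLITS = {
    # split: (#questions, #answers), from Table 2 of the paper
    "train": (2600, 16541),
    "dev": (300, 1645),
    "test": (329, 1976),
}

def answers_per_question(split):
    """Average number of answers per question for a QA split."""
    questions, answers = QA_SPLITS[split]
    return answers / questions

for name in QA_SPLITS:
    print(f"{name}: {answers_per_question(name):.2f} answers per question")
```

Running this reproduces the Table 2 averages (6.36 for training, 5.48 for dev, and roughly 6.00 for test) to within rounding.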
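The Experiment Setup row lists the final hyperparameters scattered through prose. A hedged sketch (again, not the authors' released code; the dict name and keys are our own) gathers them into one place, which is how one might seed a reimplementation:

```python
# Hypothetical config dict collecting the hyperparameters reported in
# the Experiment Setup row; key names are our own, values are the paper's.
KEHNN_CONFIG = {
    "word_embedding_dim": 100,      # d
    "max_text_length": 200,         # I and J
    "hidden_state_dim": 100,        # m, BiGRU hidden size
    "conv_pool_layers": 1,          # one convolution + one max-pooling layer
    "conv_window": (3, 3),          # tuned over {(2, 2), (3, 3), (4, 4)}
    "feature_maps": 8,              # for Arc2 / MatchPyramid / MV-LSTM / KEHNN
    "mlp_hidden_dim": 50,           # tuned over {50, 200, 400, 800}
    "dropout_rate": 0.5,
    "initial_learning_rate": 0.01,
    "batch_size": 50,
}
```

Note that Arc1 and CNTN use a 1-D window of 3 with 200 feature maps instead, so a faithful reproduction would carry a separate config for those baselines.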