Lattice CNNs for Matching Based Chinese Question Answering
Authors: Yuxuan Lai, Yansong Feng, Xiaohan Yu, Zheng Wang, Kun Xu, Dongyan Zhao (pp. 6634-6641)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on both document based question answering and knowledge based question answering tasks, and experimental results show that the LCNs models can significantly outperform the state-of-the-art matching models and strong baselines by taking advantage of a better ability to distill rich but discriminative information from the word lattice input. |
| Researcher Affiliation | Collaboration | (1) Institute of Computer Science and Technology, Peking University, China; (2) School of Computing and Communications, Lancaster University, UK; (3) Tencent AI Lab |
| Pseudocode | No | The paper describes mathematical formulations and processes but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a direct link to its own source code implementation. Footnote 4 links to a PDF appendix, and other footnotes link to general project pages for third-party tools (e.g., Keras, TensorFlow, word2vec, Jieba, Stanford segmenter). |
| Open Datasets | Yes | We conduct experiments on two Chinese question answering datasets from NLPCC-2016 evaluation task (Duan 2016). DBQA is a document based question answering dataset. There are 8.8k questions with 182k question-sentence pairs for training and 6k questions with 123k question-sentence pairs in the test set. |
| Dataset Splits | No | The paper mentions training and testing splits, and tuning hyperparameters, but does not explicitly state the size, percentage, or method of a validation split. |
| Hardware Specification | Yes | Environment: CPU, 2*Xeon E5-2640 v4. GPU: 1*NVIDIA GeForce 1080Ti |
| Software Dependencies | No | We implement our models in Keras (footnote 7) with TensorFlow (footnote 8) backend. The footnotes for Keras and TensorFlow do not specify version numbers. |
| Experiment Setup | Yes | In each CNN layer, there are 256, 512, and 256 kernels with width 1, 2, and 3, respectively. The size of the hidden layer for MLP is 1024. All activations are ReLU, the dropout rate is 0.5, with a batch size of 64. We optimize with Adadelta (Zeiler 2012) with learning rate = 1.0 and decay factor = 0.95. (A hedged configuration sketch follows the table.) |
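The reported setup maps directly onto a Keras configuration. The sketch below illustrates only those hyperparameters (kernel counts/widths, MLP size, ReLU activations, dropout, batch size, Adadelta settings), not the paper's lattice-CNN matching architecture. The sequence length, vocabulary size, embedding dimension, the shared question/answer encoder, and the binary cross-entropy loss are assumptions added for illustration, and the paper's "decay factor = 0.95" is interpreted here as Adadelta's rho.

```python
# Hedged sketch of the reported hyperparameters in Keras, NOT the paper's
# lattice-CNN model. SEQ_LEN, EMB_DIM, and VOCAB are assumed values.
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN, EMB_DIM, VOCAB = 50, 300, 100_000  # assumptions for illustration

def sentence_encoder():
    """Parallel 1-D convolutions with 256/512/256 kernels of widths 1/2/3."""
    inp = keras.Input(shape=(SEQ_LEN,))
    emb = layers.Embedding(VOCAB, EMB_DIM)(inp)
    pooled = [
        layers.GlobalMaxPooling1D()(
            layers.Conv1D(filters=f, kernel_size=w, activation="relu")(emb)
        )
        for f, w in [(256, 1), (512, 2), (256, 3)]
    ]
    return keras.Model(inp, layers.Concatenate()(pooled))

# Shared (Siamese-style) encoder over question and candidate -- an assumption.
q_in = keras.Input(shape=(SEQ_LEN,))
a_in = keras.Input(shape=(SEQ_LEN,))
enc = sentence_encoder()
merged = layers.Concatenate()([enc(q_in), enc(a_in)])

# MLP with a 1024-unit hidden layer, ReLU, and dropout 0.5, as reported.
hidden = layers.Dropout(0.5)(layers.Dense(1024, activation="relu")(merged))
score = layers.Dense(1, activation="sigmoid")(hidden)

model = keras.Model([q_in, a_in], score)
model.compile(
    optimizer=keras.optimizers.Adadelta(learning_rate=1.0, rho=0.95),
    loss="binary_crossentropy",  # assumed loss for the matching task
)
# Training would use the reported batch size of 64, e.g.:
# model.fit([q_train, a_train], y_train, batch_size=64)
```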