Lattice CNNs for Matching Based Chinese Question Answering

Authors: Yuxuan Lai, Yansong Feng, Xiaohan Yu, Zheng Wang, Kun Xu, Dongyan Zhao

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on both document based question answering and knowledge based question answering tasks, and experimental results show that the LCNs models can significantly outperform the state-of-the-art matching models and strong baselines by taking advantages of better ability to distill rich but discriminative information from the word lattice input.
Researcher Affiliation | Collaboration | (1) Institute of Computer Science and Technology, Peking University, China; (2) School of Computing and Communications, Lancaster University, UK; (3) Tencent AI Lab
Pseudocode | No | The paper describes mathematical formulations and processes but does not include any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide a direct link to its own source code implementation. Footnote 4 links to a PDF appendix, and other footnotes link to general project pages for third-party tools (e.g., Keras, TensorFlow, word2vec, Jieba, Stanford segmenter).
Open Datasets | Yes | We conduct experiments on two Chinese question answering datasets from the NLPCC-2016 evaluation task (Duan 2016). DBQA is a document based question answering dataset. There are 8.8k questions with 182k question-sentence pairs for training and 6k questions with 123k question-sentence pairs in the test set.
Dataset Splits | No | The paper mentions training and testing splits, and tuning hyperparameters, but does not explicitly state the size, percentage, or method of a validation split.
Hardware Specification | Yes | Environment: CPU: 2× XEON E5-2640 v4; GPU: 1× NVIDIA GeForce 1080Ti.
Software Dependencies | No | We implement our models in Keras with a TensorFlow backend. The footnotes for Keras and TensorFlow do not specify version numbers.
Experiment Setup | Yes | In each CNN layer, there are 256, 512, and 256 kernels with widths 1, 2, and 3, respectively. The size of the hidden layer for the MLP is 1024. All activations are ReLU, the dropout rate is 0.5, and the batch size is 64. We optimize with Adadelta (Zeiler 2012) with learning rate = 1.0 and decay factor = 0.95.
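
For reference, below is a minimal sketch of how the reported hyperparameters could be wired together in Keras with a TensorFlow backend, the frameworks the paper names. It is not the authors' lattice-CNN implementation: it only plugs the quoted numbers (kernel widths 1/2/3 with 256/512/256 filters, a 1024-unit MLP, ReLU activations, dropout 0.5, Adadelta with learning rate 1.0) into an ordinary two-branch sentence matcher. Identifiers such as max_len, vocab_size, and embed_dim are placeholder assumptions, and mapping the paper's "decay factor 0.95" to Adadelta's rho is an interpretation.

```python
# Hedged sketch of the reported hyperparameter configuration in tf.keras.
# NOT the authors' lattice-CNN model; it encodes plain token sequences only.
import tensorflow as tf
from tensorflow.keras import layers, Model, optimizers

max_len, vocab_size, embed_dim = 50, 30000, 300  # assumed placeholder sizes


def encode(inputs):
    """Parallel 1-D convolutions with the reported kernel widths and counts."""
    x = layers.Embedding(vocab_size, embed_dim)(inputs)
    branches = []
    for filters, width in [(256, 1), (512, 2), (256, 3)]:
        conv = layers.Conv1D(filters, width, activation="relu", padding="same")(x)
        branches.append(layers.GlobalMaxPooling1D()(conv))
    return layers.Concatenate()(branches)


question = layers.Input(shape=(max_len,), dtype="int32")
candidate = layers.Input(shape=(max_len,), dtype="int32")

merged = layers.Concatenate()([encode(question), encode(candidate)])
hidden = layers.Dense(1024, activation="relu")(merged)  # 1024-unit MLP layer
hidden = layers.Dropout(0.5)(hidden)                    # reported dropout rate
score = layers.Dense(1, activation="sigmoid")(hidden)   # matching score

model = Model([question, candidate], score)
model.compile(
    # rho=0.95 is an assumed reading of the paper's "decay factor = 0.95"
    optimizer=optimizers.Adadelta(learning_rate=1.0, rho=0.95),
    loss="binary_crossentropy",
)
# model.fit(..., batch_size=64)  # batch size reported in the paper
```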