Hashing Based Answer Selection

Authors: Dong Xu, Wu-Jun Li | pp. 9330-9337

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on three popular answer selection datasets show that HAS can outperform existing models to achieve state-of-the-art performance.
Researcher Affiliation | Academia | National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Department of Computer Science and Technology, Nanjing University, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We evaluate HAS on three popular answer selection datasets. The statistics about the datasets are presented in Table 1. insuranceQA (Feng et al. 2015) is a FAQ dataset from the insurance domain. We use the first version of this dataset, which has been widely used in existing works (Tan et al. 2016b; Wang, Liu, and Zhao 2016; Tan et al. 2016a; Deng et al. 2018; Tran and Niederée 2018). yahooQA is a large CQA corpus collected from Yahoo! Answers. We adopt the same dataset splits as those in (Tay et al. 2017; Tay, Tuan, and Hui 2018a; Deng et al. 2018) for fair comparison. wikiQA (Yang, Yih, and Meek 2015) is a benchmark for open-domain answer selection.
Dataset Splits | Yes | This dataset has already been partitioned into four subsets: Train, Dev, Test1 and Test2.
Hardware Specification | Yes | All experiments are run on a Titan XP GPU.
Software Dependencies | No | The paper mentions using BERT as an encoder and other models, but it does not specify version numbers for these software components or any other libraries used.
Experiment Setup | Yes | More specifically, the embedding size E and output dimension D of BERT are 768. The probability of dropout is 0.1. The weight decay coefficient is 0.01. Batch size is 64 for yahooQA, and 32 for insuranceQA and wikiQA. The attention hidden size M is 768 for insuranceQA, and 128 for yahooQA and wikiQA. The learning rate is 5e-6 for all models. The numbers of training epochs are 60 for insuranceQA, 18 for wikiQA and 9 for yahooQA; more epochs cannot bring apparent performance gain on the validation set. We evaluate all models on the validation set after each epoch and choose the parameters which achieve the best results on the validation set for the final test. All reported results are the average of five runs. There are also two other important parameters: β in tanh(βx) and the coefficient δ of the binary constraint. β is tuned among {1, 2, 5, 10, 20}, and δ is tuned among {0, 1e-7, 1e-6, 1e-5, 1e-4}.
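
To make the reported setup easier to scan, here is a minimal PyTorch sketch (hypothetical; the paper releases no code) that collects the hyperparameters quoted above and illustrates the tanh(βx) relaxation with the δ-weighted binary constraint mentioned in the last sentence. The names CONFIG and HashingLayer and the exact squared-error form of the penalty are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted from the "Experiment Setup" row above.
CONFIG = {
    "embedding_size": 768,                              # E (BERT)
    "output_dim": 768,                                  # D (BERT)
    "dropout": 0.1,
    "weight_decay": 0.01,
    "learning_rate": 5e-6,
    "batch_size": {"yahooQA": 64, "insuranceQA": 32, "wikiQA": 32},
    "attention_hidden_M": {"insuranceQA": 768, "yahooQA": 128, "wikiQA": 128},
    "epochs": {"insuranceQA": 60, "wikiQA": 18, "yahooQA": 9},
    "beta_grid": [1, 2, 5, 10, 20],                     # for tanh(beta * x)
    "delta_grid": [0, 1e-7, 1e-6, 1e-5, 1e-4],          # binary-constraint weight
}

class HashingLayer(nn.Module):
    """Relaxes the non-differentiable sign(x) with tanh(beta * x) and
    penalizes the gap to the hard binary code with weight delta.
    The squared-error penalty below is an assumed formulation."""

    def __init__(self, beta: float, delta: float):
        super().__init__()
        self.beta = beta
        self.delta = delta

    def forward(self, x: torch.Tensor):
        h = torch.tanh(self.beta * x)       # soft code in (-1, 1), differentiable
        b = torch.sign(h).detach()          # hard code in {-1, +1}, no gradient
        penalty = self.delta * (h - b).pow(2).mean()
        return h, penalty
```

Under these assumptions, the penalty would simply be added to the answer-selection loss during training, with β and δ swept over the grids above and selected on the validation set, matching the tuning protocol the row describes.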