Hashing Based Answer Selection

Authors: Dong Xu, Wu-Jun Li | pp. 9330-9337

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on three popular answer selection datasets show that HAS can outperform existing models to achieve state-of-the-art performance.
Researcher Affiliation | Academia | National Key Laboratory for Novel Software Technology, Collaborative Innovation Center of Novel Software Technology and Industrialization, Department of Computer Science and Technology, Nanjing University, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We evaluate HAS on three popular answer selection datasets. The statistics about the datasets are presented in Table 1. insuranceQA (Feng et al. 2015) is a FAQ dataset from the insurance domain. We use the first version of this dataset, which has been widely used in existing works (Tan et al. 2016b; Wang, Liu, and Zhao 2016; Tan et al. 2016a; Deng et al. 2018; Tran and Niederée 2018). yahooQA is a large CQA corpus collected from Yahoo! Answers. We adopt the same dataset splits as those in (Tay et al. 2017; Tay, Tuan, and Hui 2018a; Deng et al. 2018) for fair comparison. wikiQA (Yang, Yih, and Meek 2015) is a benchmark for open-domain answer selection.
Dataset Splits | Yes | This dataset has already been partitioned into four subsets: Train, Dev, Test1 and Test2.
Hardware Specification | Yes | All experiments are run on a Titan XP GPU.
Software Dependencies | No | The paper mentions using BERT as an encoder and other models, but it does not specify version numbers for these software components or any other libraries used.
Experiment Setup | Yes | More specifically, the embedding size E and output dimension D of BERT are 768. The probability of dropout is 0.1. The weight decay coefficient is 0.01. Batch size is 64 for yahooQA, and 32 for insuranceQA and wikiQA. The attention hidden size M is 768 for insuranceQA, and 128 for yahooQA and wikiQA. The learning rate is 5e-6 for all models. The numbers of training epochs are 60 for insuranceQA, 18 for wikiQA and 9 for yahooQA; more epochs cannot bring apparent performance gain on the validation set. We evaluate all models on the validation set after each epoch and choose the parameters which achieve the best results on the validation set for the final test. All reported results are the average of five runs. There are also two other important parameters: β in tanh(βx) and the coefficient δ of the binary constraint. β is tuned among {1, 2, 5, 10, 20}, and δ is tuned among {0, 1e-7, 1e-6, 1e-5, 1e-4}.
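
To make the reported setup easier to scan, here is a minimal PyTorch sketch (hypothetical; the paper releases no code) that collects the hyperparameters quoted above and illustrates the tanh(βx) relaxation with the δ-weighted binary constraint mentioned in the last sentence. The names CONFIG and HashingLayer and the exact squared-error form of the penalty are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted from the "Experiment Setup" row above.
CONFIG = {
    "embedding_size": 768,                              # E (BERT)
    "output_dim": 768,                                  # D (BERT)
    "dropout": 0.1,
    "weight_decay": 0.01,
    "learning_rate": 5e-6,
    "batch_size": {"yahooQA": 64, "insuranceQA": 32, "wikiQA": 32},
    "attention_hidden_M": {"insuranceQA": 768, "yahooQA": 128, "wikiQA": 128},
    "epochs": {"insuranceQA": 60, "wikiQA": 18, "yahooQA": 9},
    "beta_grid": [1, 2, 5, 10, 20],                     # for tanh(beta * x)
    "delta_grid": [0, 1e-7, 1e-6, 1e-5, 1e-4],          # binary-constraint weight
}

class HashingLayer(nn.Module):
    """Relaxes the non-differentiable sign(x) with tanh(beta * x) and
    penalizes the gap to the hard binary code with weight delta.
    The squared-error penalty below is an assumed formulation."""

    def __init__(self, beta: float, delta: float):
        super().__init__()
        self.beta = beta
        self.delta = delta

    def forward(self, x: torch.Tensor):
        h = torch.tanh(self.beta * x)       # soft code in (-1, 1), differentiable
        b = torch.sign(h).detach()          # hard code in {-1, +1}, no gradient
        penalty = self.delta * (h - b).pow(2).mean()
        return h, penalty
```

Under these assumptions, the penalty would simply be added to the answer-selection loss during training, with β and δ swept over the grids above and selected on the validation set, matching the tuning protocol the row describes.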