Hybrid Attentive Answer Selection in CQA With Deep Users Modelling

Authors: Jiahui Wen, Jingwei Ma, Yiliu Feng, Mingyang Zhong

AAAI 2018

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "We validate the proposed model on a public dataset, and demonstrate its advantages over the baselines with thorough experiments." Experiment dataset: "We validate the proposed method on SemEval-2016 Task 3: Community Question Answering (Moschitti et al. 2016), as it's a public dataset containing the user ID of each question and answer."
Researcher Affiliation | Academia | Jiahui Wen, Jingwei Ma, Yiliu Feng, Mingyang Zhong; School of Information Technology and Electrical Engineering, The University of Queensland, Australia; National University of Defence Technology, China; wenjh.nudt@gmail.com, fengyiliu11@nudt.edu.cn, {jingwei.ma, m.zhong1}@uq.edu.au
Pseudocode | No | The paper describes the model in detail with mathematical equations and prose but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not include any explicit statement about the release of source code for the described methodology, nor does it provide a link to a code repository.
Open Datasets | Yes | "We validate the proposed method on SemEval-2016 Task 3: Community Question Answering (Moschitti et al. 2016), as it's a public dataset containing the user ID of each question and answer."
Dataset Splits | Yes | Train/Dev/Test: 5319/244/327 questions; 39563/2440/3270 QA pairs; avg. question length 45/53/55; avg. answer length 37/36/37 (Table 1: Statistics of SemEval-2016 Task 3).
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor speeds, memory amounts, or other machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions using GloVe word embeddings and the Adam optimizer but does not provide version numbers for these or any other software dependencies.
Experiment Setup | Yes | The max lengths of questions and answers are set to 60 and 40 respectively, and all input vectors are padded with 0 to the max length. All hyper-parameters are tuned on the development dataset with grid search. Specifically, the number of LSTM layers for modelling questions and answers is varied in the range [1, 4], and the hidden size k of all LSTM layers and the dimension of the hidden layers are each chosen from {32, 64, 128, 256, 512}. As for the CNNs that model user information, the filter size is tuned in the range [3, 5], while the number of filter maps is set to the hidden size k. To alleviate overfitting, dropout of 0.5 is applied to the outputs of the LSTMs and CNNs. As for the training configuration, the initial learning rate for the Adam optimizer is searched amongst {1e-4, 1e-3, 1e-2}. The batch size is fixed to 64 and the model is trained for a maximum of 100 epochs. The model is evaluated at every epoch and the parameters of the best-performing model are saved.
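The grid search described above can be sketched as follows. This is a minimal, hypothetical reconstruction of the search space from the quoted setup; all variable names are illustrative and none come from the authors' (unreleased) code.

```python
from itertools import product

# Tunable hyper-parameters as reported in the paper's experiment setup.
search_space = {
    "num_lstm_layers": [1, 2, 3, 4],          # varied in the range [1, 4]
    "hidden_size": [32, 64, 128, 256, 512],   # LSTM hidden size k
    "hidden_dim": [32, 64, 128, 256, 512],    # dimension of hidden layers
    "cnn_filter_size": [3, 4, 5],             # tuned in the range [3, 5]
    "learning_rate": [1e-4, 1e-3, 1e-2],      # for the Adam optimizer
}

# Fixed settings, not searched over.
fixed = {
    "max_q_len": 60,    # questions padded with 0 to length 60
    "max_a_len": 40,    # answers padded with 0 to length 40
    "dropout": 0.5,     # on LSTM and CNN outputs
    "batch_size": 64,
    "max_epochs": 100,
}

def grid_configs(space):
    """Yield one configuration dict per point in the hyper-parameter grid."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, values))
        # The number of CNN filter maps is tied to the LSTM hidden size k.
        cfg["num_filter_maps"] = cfg["hidden_size"]
        yield {**fixed, **cfg}

configs = list(grid_configs(search_space))
print(len(configs))  # 4 * 5 * 5 * 3 * 3 = 900 candidate settings
```

Each configuration would then be trained for up to 100 epochs, evaluated on the development set at every epoch, and the best-performing parameters kept.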