A Multi-View Fusion Neural Network for Answer Selection
Authors: Lei Sha, Xiaodong Zhang, Feng Qian, Baobao Chang, Zhifang Sui
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the WikiQA and SemEval-2016 CQA datasets demonstrate that our proposed model outperforms the state-of-the-art methods. |
| Researcher Affiliation | Academia | Lei Sha, Xiaodong Zhang, Feng Qian, Baobao Chang, Zhifang Sui (contributed equally). Key Laboratory of Computational Linguistics, Ministry of Education; School of Electronics Engineering and Computer Science, Peking University. {shalei, zxdcs, nickqian, chbb, szf}@pku.edu.cn |
| Pseudocode | No | The paper describes the model's architecture and calculations using mathematical equations and diagrams, but it does not include a distinct pseudocode block or algorithm. |
| Open Source Code | No | The paper does not include an explicit statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We report the performance of our proposed method on two datasets: WikiQA (Yang, Yih, and Meek 2015) and SemEval-2016 CQA (Nakov et al. 2016). |
| Dataset Splits | Yes | Table 2 reports the statistics of the answer selection datasets. For WikiQA, all questions that have no right answers are removed. WikiQA (train/dev/test): 873/126/243 questions and 20,360/2,733/6,165 answers. SemEval-2016 CQA (train/dev/test): 4,879/244/327 questions and 36,198/2,440/3,270 answers. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper mentions pre-trained GloVe embeddings and Stanford CoreNLP but does not specify version numbers for these or any other software dependencies, which are required for reproducibility. |
| Experiment Setup | Yes | We use 100-dim word embeddings (d = 100) and we set the hidden layer length dh = 500. The external memory length dM is set to 400. The margin is set to 0.1. To compute the network parameter θ, we maximize the max-margin likelihood J(θ) through stochastic gradient descent over shuffled mini-batches with the Adadelta (Zeiler 2012) update rule. |
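The training objective described above combines a pairwise max-margin loss (margin 0.1) with the Adadelta update rule. A minimal numpy sketch of both pieces is below; it is an illustration of the general technique, not the authors' code, and the toy scorer and hyperparameter defaults (rho, eps) are assumptions from Zeiler (2012) rather than values stated in the paper.

```python
import numpy as np

def max_margin_loss(s_pos, s_neg, margin=0.1):
    """Pairwise hinge loss: penalize when the correct answer's score
    does not beat a wrong answer's score by at least `margin`.
    The margin value 0.1 matches the paper's reported setting."""
    return np.maximum(0.0, margin - s_pos + s_neg)

class Adadelta:
    """Minimal Adadelta update rule (Zeiler 2012): per-parameter
    learning rates from running averages of squared gradients and
    squared updates. rho/eps defaults are Zeiler's, assumed here."""
    def __init__(self, shape, rho=0.95, eps=1e-6):
        self.rho, self.eps = rho, eps
        self.eg2 = np.zeros(shape)  # running avg of squared gradients
        self.ed2 = np.zeros(shape)  # running avg of squared updates

    def step(self, grad):
        # Accumulate gradient statistics, compute the scaled update,
        # then accumulate update statistics.
        self.eg2 = self.rho * self.eg2 + (1 - self.rho) * grad ** 2
        delta = -np.sqrt(self.ed2 + self.eps) / np.sqrt(self.eg2 + self.eps) * grad
        self.ed2 = self.rho * self.ed2 + (1 - self.rho) * delta ** 2
        return delta

# Toy usage: a linear scorer s(x) = w.x trained on one (pos, neg) pair.
w = np.zeros(3)
x_pos, x_neg = np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])
opt = Adadelta(w.shape)
s_pos, s_neg = w @ x_pos, w @ x_neg
if max_margin_loss(s_pos, s_neg) > 0:
    # Subgradient of the hinge term w.r.t. w when the margin is violated.
    grad = x_neg - x_pos
    w = w + opt.step(grad)
```

In the paper the scores come from the full multi-view fusion network rather than a linear model; the hinge-plus-Adadelta structure of the update is the part the setup description pins down.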