Assertion-Based QA With Question-Aware Open Information Extraction

Authors: Zhao Yan, Duyu Tang, Nan Duan, Shujie Liu, Wendi Wang, Daxin Jiang, Ming Zhou, Zhoujun Li

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that our approaches have the ability to infer question-aware assertions from a passage. We further evaluate our approaches by incorporating the ABQA results as additional features in passage-based QA. Results on two datasets show that ABQA features significantly improve the accuracy of passage-based QA.
Researcher Affiliation | Collaboration | Zhao Yan, Duyu Tang, Nan Duan, Shujie Liu, Wendi Wang, Daxin Jiang, Ming Zhou, and Zhoujun Li. State Key Lab of Software Development Environment, Beihang University, Beijing, China; Microsoft Research, Beijing, China; Microsoft, Beijing, China. {yanzhao, lizj}@buaa.edu.cn; {dutang, nanduan, shujliu, wendw, djiang, mingzhou}@microsoft.com
Pseudocode | No | The paper describes the proposed models and methods using textual descriptions and mathematical equations, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states that the dataset "Web Assertions" will be released to the community, but there is no explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | To study the ABQA task, we construct a human-labeled dataset called Web Assertions, which includes hand-annotated QA labels for 358,427 assertions in 55,960 web passages. We randomly split the Web Assertions dataset into training, development, and test sets with an 80:10:10 split. Results are reported on the WikiQA and MARCO datasets, both of which are suitable for testing our ABQA approach because the questions in these datasets are also real user queries from a search engine, consistent with the Web Assertions dataset. WikiQA is a benchmark dataset for answer sentence selection, constructed from natural language questions and Wikipedia documents. The MARCO dataset was originally constructed for the reading comprehension task, yet it also includes manual annotations for passage ranking.
Dataset Splits | Yes | In this experiment, we randomly split the Web Assertions dataset into training, development, and test sets with an 80:10:10 split.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for the experiments (e.g., CPU or GPU models, memory, or cloud instance types).
Software Dependencies | No | The paper mentions software components and tools such as ClausIE, GRU, AdaDelta, and LambdaMART, but it does not specify version numbers for these or any other software dependencies, which is required for reproducibility.
Experiment Setup | No | The paper describes the general training approach (e.g., "parameters in Seq2Ast are randomly initialized, and updated with AdaDelta"), but it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed system-level training settings.
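The paper reports an 80:10:10 random split of the Web Assertions dataset but releases no split script, so the exact partition is not reproducible. A minimal sketch of how such a split could be reimplemented is shown below; the function name, the fixed seed, and the placeholder passage IDs are assumptions for illustration, not details from the paper.

```python
import random

def split_dataset(examples, seed=42, ratios=(0.8, 0.1, 0.1)):
    """Shuffle examples and partition them into train/dev/test by ratio.

    A fixed seed makes the partition repeatable across runs, which is
    the detail the paper omits.
    """
    assert abs(sum(ratios) - 1.0) < 1e-9, "ratios must sum to 1"
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])
    n_dev = int(n * ratios[1])
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test

# Placeholder IDs; the real dataset has 55,960 web passages.
train, dev, test = split_dataset(range(1000))
print(len(train), len(dev), len(test))  # 800 100 100
```

Without the authors' seed (or a published list of split IDs), any such reimplementation yields a different partition, which is why "Dataset Splits: Yes" above covers only the split ratio, not the split itself.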