Combining Fact Extraction and Verification with Neural Semantic Matching Networks
Authors: Yixin Nie, Haonan Chen, Mohit Bansal
AAAI 2019, pp. 6859-6866
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the FEVER dataset indicate that (1) our neural semantic matching method outperforms popular TF-IDF and encoder models by significant margins on all evidence retrieval metrics, (2) the additional relatedness score and WordNet features improve the NLI model via better semantic awareness, and (3) by formalizing all three subtasks as a similar semantic matching problem and improving on all three stages, the complete model is able to achieve state-of-the-art results on the FEVER test set (two times greater than baseline results). |
| Researcher Affiliation | Academia | Yixin Nie, Haonan Chen, Mohit Bansal Department of Computer Science University of North Carolina at Chapel Hill {yixin1, haonanchen, mbansal}@cs.unc.edu |
| Pseudocode | No | The paper describes its Neural Semantic Matching Network architecture in detail with equations but does not present a formal pseudocode block or algorithm. |
| Open Source Code | Yes | Code: https://github.com/easonnie/combine-FEVER-NSMN |
| Open Datasets | Yes | The recently-released FEVER dataset introduced a benchmark fact-verification task in which a system is asked to verify a claim using evidential sentences from Wikipedia documents. The recent release of the Fact Extraction and VERification (FEVER) dataset (Thorne et al. 2018) not only provides valuable fuel for applying data-driven neural approaches on evidence retrieval and claim verification, but also introduces a standardized benchmark task for automatic fact checking. |
| Dataset Splits | Yes | The dataset details (size and splits) are discussed in Thorne et al. (2018). In Table 2, we compare the performance of different methods for document retrieval on the entire dev set and on a difficult subset of the dev set. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., CPU, GPU models, or memory). |
| Software Dependencies | No | The paper mentions software components like Adam optimizer, GloVe, ELMo, and WordNet, but does not provide specific version numbers for any of the software dependencies or libraries used. |
| Experiment Setup | Yes | We used Adam optimizer (Kingma and Ba 2015) with a batch size of 128. |
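The reported setup names only the optimizer (Adam, Kingma and Ba 2015) and the batch size of 128. As a minimal sketch of what a single Adam update does, the scalar version below uses Adam's published default hyperparameters (`beta1`, `beta2`, `eps`, `lr`); these defaults are an assumption, since the paper does not confirm its exact values.

```python
import math

def adam_step(theta, grad, m, v, t,
              lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter (Kingma and Ba 2015).

    m, v are the running first/second moment estimates; t is the
    1-indexed step count used for bias correction.
    """
    m = beta1 * m + (1 - beta1) * grad            # first moment
    v = beta2 * v + (1 - beta2) * grad ** 2       # second moment
    m_hat = m / (1 - beta1 ** t)                  # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v

# Example: one update on a zero-initialized parameter with gradient 1.0;
# after bias correction the step size is roughly lr = 0.001.
theta, m, v = adam_step(theta=0.0, grad=1.0, m=0.0, v=0.0, t=1)
```

In practice the paper's model would apply this update element-wise over all parameters of the NSMN, averaging gradients over each batch of 128 examples.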