Combining Fact Extraction and Verification with Neural Semantic Matching Networks

Authors: Yixin Nie, Haonan Chen, Mohit Bansal

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the FEVER dataset indicate that (1) our neural semantic matching method outperforms popular TF-IDF and encoder models by significant margins on all evidence retrieval metrics, (2) the additional relatedness score and WordNet features improve the NLI model via better semantic awareness, and (3) by formalizing all three subtasks as a similar semantic matching problem and improving on all three stages, the complete model is able to achieve state-of-the-art results on the FEVER test set (two times greater than baseline results).
Researcher Affiliation | Academia | Yixin Nie, Haonan Chen, Mohit Bansal, Department of Computer Science, University of North Carolina at Chapel Hill, {yixin1, haonanchen, mbansal}@cs.unc.edu
Pseudocode | No | The paper describes its Neural Semantic Matching Network architecture in detail with equations but does not present a formal pseudocode block or algorithm (see the hedged pipeline sketch after this table).
Open Source Code | Yes | Code: https://github.com/easonnie/combine-FEVER-NSMN
Open Datasets | Yes | The recently-released FEVER dataset introduced a benchmark fact-verification task in which a system is asked to verify a claim using evidential sentences from Wikipedia documents. The recent release of the Fact Extraction and VERification (FEVER) dataset (Thorne et al. 2018) not only provides valuable fuel for applying data-driven neural approaches to evidence retrieval and claim verification, but also introduces a standardized benchmark task for automatic fact checking.
Dataset Splits | Yes | The dataset details (size and splits) are discussed in Thorne et al. (2018). In Table 2, we compare the performance of different methods for document retrieval on the entire dev set and on a difficult subset of the dev set.
Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., CPU or GPU models, or memory).
Software Dependencies | No | The paper mentions software components such as the Adam optimizer, GloVe, ELMo, and WordNet, but does not provide version numbers for any of the software dependencies or libraries used.
Experiment Setup | Yes | We used Adam optimizer (Kingma and Ba 2015) with a batch size of 128. (A minimal training-setup sketch follows the table.)
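
Since the paper gives equations but no formal algorithm block, the following is a minimal Python sketch (not the authors' released code) of the three-stage FEVER pipeline it describes: document retrieval, sentence selection, and claim verification, each cast as semantic matching between two text sequences. All function names (nsmn_score, nli_classify, verify) and the top-k cutoffs are hypothetical placeholders.

# Hypothetical sketch of the three-stage pipeline; illustrative only.

def nsmn_score(seq_a, seq_b):
    """Placeholder for the Neural Semantic Matching Network: returns a
    matching score between two token sequences (e.g., claim vs. sentence)."""
    raise NotImplementedError  # a trained matching model would go here

def nli_classify(claim, evidence_text):
    """Placeholder NLI classifier over claim + concatenated evidence,
    returning SUPPORTS / REFUTES / NOT ENOUGH INFO."""
    raise NotImplementedError

def verify(claim, wiki_docs, doc_k=10, sent_k=5):
    # Stage 1: document retrieval -- match the claim against candidate documents.
    docs = sorted(wiki_docs, key=lambda d: nsmn_score(claim, d.text), reverse=True)[:doc_k]

    # Stage 2: sentence selection -- match the claim against sentences in those documents.
    sentences = [s for d in docs for s in d.sentences]
    evidence = sorted(sentences, key=lambda s: nsmn_score(claim, s), reverse=True)[:sent_k]

    # Stage 3: claim verification -- NLI over the claim and the selected evidence.
    label = nli_classify(claim, " ".join(evidence))
    return label, evidence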
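
For reproduction, the only training hyperparameters the paper states are the Adam optimizer and a batch size of 128. Below is a minimal PyTorch sketch of that setup; the model (a toy linear layer), the dummy data, and the learning rate (PyTorch's default, since none is reported) are assumptions, not details from the paper.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for the NSMN model; the real architecture uses GloVe/ELMo
# inputs and the alignment layers described in the paper.
model = nn.Linear(300, 3)

# From the paper: Adam optimizer, batch size 128.
# The learning rate is NOT reported, so the PyTorch default (1e-3) is assumed.
optimizer = torch.optim.Adam(model.parameters())

# Dummy data just to make the loop runnable.
xs, ys = torch.randn(1024, 300), torch.randint(0, 3, (1024,))
loader = DataLoader(TensorDataset(xs, ys), batch_size=128, shuffle=True)

loss_fn = nn.CrossEntropyLoss()
for x, y in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()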