Combining Fact Extraction and Verification with Neural Semantic Matching Networks

Authors: Yixin Nie, Haonan Chen, Mohit Bansal

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on the FEVER dataset indicate that (1) our neural semantic matching method outperforms popular TF-IDF and encoder models by significant margins on all evidence retrieval metrics, (2) the additional relatedness score and WordNet features improve the NLI model via better semantic awareness, and (3) by formalizing all three subtasks as a similar semantic matching problem and improving on all three stages, the complete model is able to achieve state-of-the-art results on the FEVER test set (two times greater than baseline results).
Researcher Affiliation | Academia | Yixin Nie, Haonan Chen, Mohit Bansal, Department of Computer Science, University of North Carolina at Chapel Hill, {yixin1, haonanchen, mbansal}@cs.unc.edu
Pseudocode | No | The paper describes its Neural Semantic Matching Network architecture in detail with equations but does not present a formal pseudocode block or algorithm (see the hedged pipeline sketch after this table).
Open Source Code | Yes | Code: https://github.com/easonnie/combine-FEVER-NSMN
Open Datasets | Yes | The recently-released FEVER dataset introduced a benchmark fact-verification task in which a system is asked to verify a claim using evidential sentences from Wikipedia documents. The recent release of the Fact Extraction and VERification (FEVER) dataset (Thorne et al. 2018) not only provides valuable fuel for applying data-driven neural approaches to evidence retrieval and claim verification, but also introduces a standardized benchmark task for automatic fact checking.
Dataset Splits | Yes | The dataset details (size and splits) are discussed in Thorne et al. (2018). In Table 2, we compare the performance of different methods for document retrieval on the entire dev set and on a difficult subset of the dev set.
Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., CPU or GPU models, or memory).
Software Dependencies | No | The paper mentions software components such as the Adam optimizer, GloVe, ELMo, and WordNet, but does not provide version numbers for any of the software dependencies or libraries used.
Experiment Setup | Yes | We used Adam optimizer (Kingma and Ba 2015) with a batch size of 128. (A minimal training-setup sketch follows the table.)
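
Since the paper gives equations but no formal algorithm block, the following is a minimal Python sketch (not the authors' released code) of the three-stage FEVER pipeline it describes: document retrieval, sentence selection, and claim verification, each cast as semantic matching between two text sequences. All function names (nsmn_score, nli_classify, verify) and the top-k cutoffs are hypothetical placeholders.

# Hypothetical sketch of the three-stage pipeline; illustrative only.

def nsmn_score(seq_a, seq_b):
    """Placeholder for the Neural Semantic Matching Network: returns a
    matching score between two token sequences (e.g., claim vs. sentence)."""
    raise NotImplementedError  # a trained matching model would go here

def nli_classify(claim, evidence_text):
    """Placeholder NLI classifier over claim + concatenated evidence,
    returning SUPPORTS / REFUTES / NOT ENOUGH INFO."""
    raise NotImplementedError

def verify(claim, wiki_docs, doc_k=10, sent_k=5):
    # Stage 1: document retrieval -- match the claim against candidate documents.
    docs = sorted(wiki_docs, key=lambda d: nsmn_score(claim, d.text), reverse=True)[:doc_k]

    # Stage 2: sentence selection -- match the claim against sentences in those documents.
    sentences = [s for d in docs for s in d.sentences]
    evidence = sorted(sentences, key=lambda s: nsmn_score(claim, s), reverse=True)[:sent_k]

    # Stage 3: claim verification -- NLI over the claim and the selected evidence.
    label = nli_classify(claim, " ".join(evidence))
    return label, evidence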
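
For reproduction, the only training hyperparameters the paper states are the Adam optimizer and a batch size of 128. Below is a minimal PyTorch sketch of that setup; the model (a toy linear layer), the dummy data, and the learning rate (PyTorch's default, since none is reported) are assumptions, not details from the paper.

import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Toy stand-in for the NSMN model; the real architecture uses GloVe/ELMo
# inputs and the alignment layers described in the paper.
model = nn.Linear(300, 3)

# From the paper: Adam optimizer, batch size 128.
# The learning rate is NOT reported, so the PyTorch default (1e-3) is assumed.
optimizer = torch.optim.Adam(model.parameters())

# Dummy data just to make the loop runnable.
xs, ys = torch.randn(1024, 300), torch.randint(0, 3, (1024,))
loader = DataLoader(TensorDataset(xs, ys), batch_size=128, shuffle=True)

loss_fn = nn.CrossEntropyLoss()
for x, y in loader:
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()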