Evidence Inference Networks for Interpretable Claim Verification
Authors: Lianwei Wu, Yuan Rao, Ling Sun, Wangbo He | pp. 14058-14066
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two widely used datasets demonstrate that EVIN not only achieves satisfactory performance but also provides explainable evidence for end-users. |
| Researcher Affiliation | Academia | Lianwei Wu, Yuan Rao, Ling Sun, Wangbo He Xi'an Key Lab. of Social Intelligence and Complexity Data Processing, School of Software Engineering, Xi'an Jiaotong University, China Shaanxi Joint Key Laboratory for Artificial Intelligence (Sub-Lab of Xi'an Jiaotong University), China Research Institute of Xi'an Jiaotong University, Shenzhen, China {stayhungry, sunling}@stu.xjtu.edu.cn, raoyuan@mail.xjtu.edu.cn, 744758858@qq.com |
| Pseudocode | No | The paper includes a system architecture diagram (Figure 2) and describes the model components using mathematical formulas and prose, but it does not contain explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or provide a link to a code repository for the methodology described. |
| Open Datasets | Yes | We adopt two widely-used competitive datasets released by Popat et al. (2018) for evaluation. Their details are shown as follows: Snopes Dataset. Snopes possesses 4,341 claims and corresponding 29,242 relevant articles that include opinions on claims retrieved from 3,267 domains by Bing search API. Each claim in Snopes is labeled as true or false. PolitiFact Dataset. PolitiFact has 3,568 claims and 29,556 relevant articles associated with 3,028 domains. |
| Dataset Splits | Yes | We hold out 10% of the claims in the two datasets as development set for tuning the hyper-parameters, and conduct 5-fold cross-validation on the rest of the claims. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions software like 'Pytorch', 'Theano', and 'BERT-base model' but does not specify any version numbers for these software components. |
| Experiment Setup | Yes | For parameter configurations, the pre-trained BERT-base model is used to initialize word embeddings. The size of embeddings is set as 768. The number R of relevant articles varies with different claims. The length k of claim sequence and that of each relevant article are set to 30, and 120, respectively, while the length of the integrated sequence of all relevant articles varies with the number of relevant articles. In self-attention networks, attention heads and blocks are set to 6 and 4, respectively, and the dropout of multi-head attention is set to 0.5. Additionally, the initial learning rate is set to 0.001. We use L2-regularizers with the fully connected layers as well as dropout and set it to 0.6, and the mini-batch size is 64. |
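The split protocol quoted above (hold out 10% of claims as a development set, 5-fold cross-validation on the rest) can be sketched as follows. The function name, the shuffle, and the random seed are my assumptions; only the 10% / 5-fold figures and the claim counts come from the paper.

```python
import random

def split_claims(claim_ids, dev_frac=0.10, n_folds=5, seed=0):
    """Sketch of the reported protocol: hold out 10% of claims for
    hyper-parameter tuning, then 5-fold cross-validate the remainder.
    Shuffling and the seed are assumptions, not stated in the paper."""
    rng = random.Random(seed)
    ids = list(claim_ids)
    rng.shuffle(ids)
    n_dev = int(len(ids) * dev_frac)
    dev, rest = ids[:n_dev], ids[n_dev:]
    fold_size = len(rest) // n_folds
    folds = []
    for i in range(n_folds):
        start = i * fold_size
        end = start + fold_size if i < n_folds - 1 else len(rest)
        test = rest[start:end]
        train = rest[:start] + rest[end:]
        folds.append((train, test))
    return dev, folds

# Snopes has 4,341 claims; check the resulting split sizes.
dev, folds = split_claims(range(4341))
print(len(dev))    # 434 claims held out for tuning
print(len(folds))  # 5 cross-validation folds
```

Each fold trains on the non-held-out claims outside the fold and tests on the fold itself, so every non-development claim is tested exactly once.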
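For reference, the hyper-parameters quoted in the Experiment Setup row collect into a single configuration fragment. The dict and its key names are my own naming convention; the values are taken directly from the paper.

```python
# EVIN hyper-parameters as reported in the paper; key names are assumed.
evin_config = {
    "word_embedding": "BERT-base",  # pre-trained model initializing embeddings
    "embedding_size": 768,
    "claim_length": 30,             # tokens per claim sequence
    "article_length": 120,          # tokens per relevant article
    "attention_heads": 6,
    "attention_blocks": 4,
    "attention_dropout": 0.5,       # dropout of multi-head attention
    "learning_rate": 0.001,         # initial learning rate
    "fc_dropout": 0.6,              # dropout on fully connected layers
    "batch_size": 64,
}
print(evin_config["embedding_size"])  # 768
```

Note that the number R of relevant articles, and hence the length of the integrated article sequence, varies per claim and so is not a fixed entry here.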