Multi-Hop Fact Checking of Political Claims
Authors: Wojciech Ostrowski, Arnav Arora, Pepa Atanasova, Isabelle Augenstein
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We: 1) construct a small annotated dataset, PolitiHop, of evidence sentences for claim verification; 2) compare it to existing multi-hop datasets; and 3) study how to transfer knowledge from more extensive in- and out-of-domain resources to PolitiHop. We find that the task is complex and achieve the best performance with an architecture that specifically models reasoning over evidence pieces in combination with in-domain transfer learning. |
| Researcher Affiliation | Academia | Wojciech Ostrowski, Arnav Arora, Pepa Atanasova and Isabelle Augenstein, Department of Computer Science, University of Copenhagen, Denmark. qnj566@alumni.ku.dk, {aar, pepa, augenstein}@di.ku.dk |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. The methodology is described in narrative text. |
| Open Source Code | Yes | We make the PolitiHop dataset and the code for the experiments publicly available on https://github.com/copenlu/politihop . |
| Open Datasets | Yes | We make the PolitiHop dataset and the code for the experiments publicly available on https://github.com/copenlu/politihop . |
| Dataset Splits | Yes | It consists of 500 manually annotated claims in written English, split into a training (300 instances) and a test set (200 instances). ... We split the training data into train and dev datasets, where the former has 592 examples and the latter 141. (See the split-loading sketch below the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments (e.g., specific GPU or CPU models, memory sizes). It mentions models like BERT and Transformer-XH but not the computational resources. |
| Software Dependencies | No | The paper mentions using BERT [Devlin et al., 2019] and Transformer-XH [Zhao et al., 2020] but does not specify version numbers for these or other software dependencies (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | In our experiments, we set k = 6 since this is the average number of evidence sentences selected by a single annotator. ... We use three eXtra hop layers as in [Zhao et al., 2020], which corresponds to three-hop reasoning, and we experiment with varying the number of hops. (See the setup sketch below the table.) |
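
As a companion to the split figures quoted above, here is a minimal sketch of loading the PolitiHop splits. The file names and TSV layout are assumptions based on the public repository (https://github.com/copenlu/politihop), not details confirmed by the paper; adjust the paths to match the actual release.

```python
# Minimal sketch of loading the PolitiHop splits described in the
# "Dataset Splits" row. File names are assumptions based on the
# repository at https://github.com/copenlu/politihop.
import pandas as pd

train = pd.read_csv("politihop/politihop_train.tsv", sep="\t")  # hypothetical path
valid = pd.read_csv("politihop/politihop_valid.tsv", sep="\t")  # hypothetical path
test = pd.read_csv("politihop/politihop_test.tsv", sep="\t")    # hypothetical path

# Sanity-check the sizes against the figures quoted in the table
# (592 train / 141 dev / 200 test examples).
print(len(train), len(valid), len(test))
```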
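
The "Experiment Setup" row names two concrete choices: top-k evidence selection with k = 6 and three eXtra hop layers, corresponding to three-hop reasoning. The sketch below illustrates both with a simplified self-attention hop over evidence embeddings; it is an illustrative stand-in under stated assumptions, not the authors' Transformer-XH implementation, and the dot-product evidence scorer is hypothetical.

```python
# Simplified sketch of the setup quoted above: keep the top k = 6
# evidence sentences, then apply three attention "hops" over their
# representations (a stand-in for Transformer-XH's eXtra hop layers).
import torch
import torch.nn.functional as F

K, HOPS, DIM = 6, 3, 768  # k = 6 and three hops, as in the paper's setup

def rank_evidence(sentence_embs: torch.Tensor, claim_emb: torch.Tensor) -> torch.Tensor:
    """Keep the k sentences scoring highest against the claim (hypothetical scorer)."""
    scores = sentence_embs @ claim_emb                      # (num_sentences,)
    top = torch.topk(scores, k=min(K, scores.numel())).indices
    return sentence_embs[top]                               # (<=K, DIM)

def hop_attention(nodes: torch.Tensor) -> torch.Tensor:
    """One simplified hop: every evidence node attends to all the others."""
    attn = F.softmax(nodes @ nodes.T / DIM ** 0.5, dim=-1)  # (K, K)
    return attn @ nodes                                     # (K, DIM)

sentences = torch.randn(20, DIM)  # stand-in sentence embeddings
claim = torch.randn(DIM)          # stand-in claim embedding

nodes = rank_evidence(sentences, claim)
for _ in range(HOPS):             # three-hop reasoning over evidence pieces
    nodes = hop_attention(nodes)
```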