Causal Walk: Debiasing Multi-Hop Fact Verification with Front-Door Adjustment
Authors: Congzhi Zhang, Linhai Zhang, Deyu Zhou
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that Causal Walk outperforms some previous debiasing methods on both existing datasets and the newly constructed datasets. |
| Researcher Affiliation | Academia | Congzhi Zhang*, Linhai Zhang*, Deyu Zhou; School of Computer Science and Engineering, Key Laboratory of Computer Network and Information Integration, Ministry of Education, Southeast University, China; {zhangcongzhi, lzhang472, d.zhou}@seu.edu.cn |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data will be released at https://github.com/zcccccz/CausalWalk. |
| Open Datasets | Yes | We evaluate the model performance on the FEVER dataset and PolitiHop dataset and their variants. For training, all models are trained on the original training set of FEVER and PolitiHop... FEVER (Thorne et al. 2018) and PolitiHop (Ostrowski et al. 2021) respectively. Code and data will be released at https://github.com/zcccccz/CausalWalk. |
| Dataset Splits | Yes | For training, all models are trained on the original training set of FEVER and PolitiHop. For testing, the development set of FEVER and the test set of PolitiHop are adopted, denoted as FEVER (Thorne et al. 2018) and PolitiHop (Ostrowski et al. 2021) respectively. |
| Hardware Specification | No | The paper mentions support from 'Big Data Computing Center of Southeast University' but does not specify any particular hardware components like CPU/GPU models, memory, or specific machine configurations used for experiments. |
| Software Dependencies | No | The paper mentions using 'BERT-base' and 'Adam optimizer' but does not provide specific version numbers for any software libraries, frameworks (e.g., PyTorch, TensorFlow), or programming languages. |
| Experiment Setup | Yes | The learning rate is 1e-5. All models are trained for 10 epochs with a batch size of 4. We update the parameters using Adam optimizer. BERT-Concat, CICR, and CLEVER have a maximum input length of 512, and the other models have a maximum input length of 128. The maximum number n of evidence per sample is 20. The beam width w is 3 and the path sampling length m is 5. The number of samples k for each category in the confounder dictionary is 5. The intervention weight parameter α is 0.1. |
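For reference, the hyperparameters quoted in the Experiment Setup row can be collected into a single training configuration. The sketch below is illustrative only: the paper names BERT-base and the Adam optimizer but pins no dependency versions, so the `CausalWalkConfig` class, the `bert-base-uncased` checkpoint name, and the PyTorch/Transformers calls are assumptions, not the authors' released code.

```python
# A minimal sketch, assuming PyTorch and HuggingFace Transformers (neither is
# version-pinned in the paper). Only the numeric values come from the paper;
# CausalWalkConfig and the checkpoint name are hypothetical.
from dataclasses import dataclass

import torch
from transformers import BertModel


@dataclass
class CausalWalkConfig:
    learning_rate: float = 1e-5      # "The learning rate is 1e-5."
    epochs: int = 10                 # trained for 10 epochs
    batch_size: int = 4              # with a batch size of 4
    max_input_length: int = 128      # 512 for BERT-Concat, CICR, and CLEVER
    max_evidence_n: int = 20         # maximum number n of evidence per sample
    beam_width_w: int = 3            # beam width w
    path_length_m: int = 5           # path sampling length m
    confounder_samples_k: int = 5    # samples per category in the confounder dictionary
    intervention_alpha: float = 0.1  # intervention weight parameter α


config = CausalWalkConfig()

# The paper specifies BERT-base as the encoder and Adam as the optimizer.
encoder = BertModel.from_pretrained("bert-base-uncased")
optimizer = torch.optim.Adam(encoder.parameters(), lr=config.learning_rate)
```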