TabFact: A Large-scale Dataset for Table-based Fact Verification
Authors: Wenhu Chen, Hongmin Wang, Jianshu Chen, Yunkai Zhang, Hong Wang, Shiyang Li, Xiyou Zhou, William Yang Wang
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform extensive experiments to investigate their performances: the best-achieved accuracy of both models is reasonable, but far below human performance. |
| Researcher Affiliation | Collaboration | University of California, Santa Barbara, CA, USA; Tencent AI Lab, Bellevue, WA, USA |
| Pseudocode | Yes | Algorithm 1 Latent Program Search with Comments |
| Open Source Code | Yes | The data and code of the dataset are provided in https://github.com/wenhuchen/Table-Fact-Checking. |
| Open Datasets | Yes | To this end, we construct a large-scale dataset called TabFact with 16k Wikipedia tables as the evidence for 118k human-annotated natural language statements, which are labeled as either ENTAILED or REFUTED. The data and code of the dataset are provided in https://github.com/wenhuchen/Table-Fact-Checking. |
| Dataset Splits | Yes | We split the whole data roughly with 8:1:1 into train, validation, and test splits and show their statistics in Table 1. Table 1: ... Val 12,792 |
| Hardware Specification | Yes | We finetune the model on a single TITAN X GPU with a mini-batch size of 6. ... We run the latent program search in a distributed fashion on three 64-core machines |
| Software Dependencies | No | The paper mentions using "open-source implementation of BERT" and "Transformer-based two-way encoder" but does not provide specific version numbers for these or other software libraries/frameworks (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | We finetune the model on a single TITAN X GPU with a mini-batch size of 6. The best performance is reached after about 3 hours of training (around 10K steps). ... For the discriminator model, we design two transformer-based encoders (3 layers, 128-dimension hidden embedding, and 4 heads at each layer) to encode the programs and statements, respectively. |
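The roughly 8:1:1 split reported under "Dataset Splits" can be illustrated with a short sketch. This is an assumed, minimal version (the function name `split_ids`, the fixed seed, and splitting over table IDs are all illustrative choices, not the released procedure); the official splits ship with the repository linked above.

```python
# Minimal sketch (assumed, not the released script) of a roughly 8:1:1
# train/validation/test split over table identifiers.
import random

def split_ids(table_ids, seed=0):
    ids = list(table_ids)
    random.Random(seed).shuffle(ids)          # deterministic shuffle for reproducibility
    n = len(ids)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]        # ~10% validation
    test = ids[n_train + n_val:]              # remaining ~10% test
    return train, val, test
```

The discriminator configuration quoted under "Experiment Setup" (two Transformer-based encoders with 3 layers, 128-dimensional hidden embeddings, and 4 heads per layer) can be sketched with standard PyTorch modules. This is not the authors' implementation; the class name, vocabulary size, and feed-forward width below are assumptions made only for illustration.

```python
# Minimal sketch of a 3-layer, 128-dim, 4-head Transformer encoder,
# matching the configuration reported in the paper's experiment setup.
import torch
import torch.nn as nn

class SequenceEncoder(nn.Module):
    def __init__(self, vocab_size=30000, d_model=128, nhead=4, num_layers=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead,
            dim_feedforward=4 * d_model,      # assumed feed-forward width
            batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

    def forward(self, token_ids, padding_mask=None):
        # token_ids: (batch, seq_len); padding_mask: True at padded positions
        x = self.embed(token_ids)
        return self.encoder(x, src_key_padding_mask=padding_mask)

# Two such encoders, one for programs and one for statements,
# as described in the quoted experiment setup.
program_encoder = SequenceEncoder()
statement_encoder = SequenceEncoder()
```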