Reasoning over Hybrid Chain for Table-and-Text Open Domain Question Answering
Authors: Wanjun Zhong, Junjie Huang, Qian Liu, Ming Zhou, Jiahai Wang, Jian Yin, Nan Duan
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our system on OTT-QA, a large-scale table-and-text open-domain question answering benchmark, and our system achieves the state-of-the-art performance. Further analyses illustrate that the explicit hybrid chain offers substantial performance improvement and interpretability of the intermediate reasoning process, and the chain-centric pre-training boosts the performance on the chain extraction. ... 4 Experiments We conduct experiments to explore the effectiveness of our method from the following aspects: (1) the performance of our overall system on QA; (2) the performance of the hybrid chain extraction model; (3) the ablation study about the pre-training strategy; (4) the comprehensive qualitative analysis. |
| Researcher Affiliation | Collaboration | Wanjun Zhong¹, Junjie Huang³, Qian Liu³, Ming Zhou⁴, Jiahai Wang¹, Jian Yin¹ and Nan Duan². ¹School of Computer Science and Engineering, Sun Yat-sen University; ²Microsoft Research Asia; ³Beihang University; ⁴Langboat Technology |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/zhongwanjun/CARP |
| Open Datasets | Yes | We evaluate our approach on the OTT-QA [Chen et al., 2020a] dataset. OTT-QA is a large-scale table-and-text open-domain question answering benchmark for evaluating open-domain question answering over both tabular and textual knowledge. OTT-QA has over 40K instances and it also provides a corpus collected from Wikipedia with over 400K tables and 6 million passages. |
| Dataset Splits | Yes | Table 1 reports EM and F1 for each model on both the dev set and the blind test set: "Table 1: Performance of different methods on the dev. set and the blind test set on OTT-QA." |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions software components like RoBERTa, BART, Longformer, BLINK, and FAISS, but it does not specify their version numbers. |
| Experiment Setup | No | The paper describes the model architecture and training objectives (e.g., cross-entropy loss, sparse-attention Transformer), but it does not provide specific hyperparameter values such as learning rates, batch sizes, number of epochs, or detailed optimizer settings necessary for replication. |
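The last row flags that the paper omits the hyperparameters a replication would need. As an illustration of the minimum set of settings such a replication would have to record, here is a minimal config sketch. Every value below is a hypothetical placeholder chosen for illustration; none is taken from the paper.

```python
from dataclasses import dataclass, asdict

# All values are HYPOTHETICAL placeholders, not reported in the paper.
# They illustrate the settings a replication-ready write-up would state:
# learning rate, batch size, epochs, optimizer, sequence length, and seed.
@dataclass
class ReaderTrainingConfig:
    model_name: str = "allenai/longformer-base-4096"  # hypothetical checkpoint id
    learning_rate: float = 3e-5   # hypothetical
    batch_size: int = 16          # hypothetical
    num_epochs: int = 5           # hypothetical
    max_seq_length: int = 4096    # hypothetical
    optimizer: str = "AdamW"      # hypothetical
    seed: int = 42                # hypothetical

config = ReaderTrainingConfig()
print(asdict(config))
```

Recording such a block (plus exact library versions and hardware) alongside the released code would resolve the "No" entries in the Hardware Specification, Software Dependencies, and Experiment Setup rows.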