Question Decomposition Tree for Answering Complex Questions over Knowledge Bases

Authors: Xiang Huang, Sitao Cheng, Yiheng Shu, Yuheng Bao, Yuzhong Qu

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that QDTQA outperforms previous state-of-the-art methods on the ComplexWebQuestions dataset. Besides, our decomposition method improves an existing KBQA system by 11% and sets a new state-of-the-art on LC-QuAD 1.0.
Researcher Affiliation | Academia | State Key Laboratory for Novel Software Technology, Nanjing University, China. {xianghuang, stcheng, yhshu, yhbao}@smail.nju.edu.cn, yzqu@nju.edu.cn
Pseudocode | No | The paper describes the methods textually and with a flowchart (Figure 2) but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | Yes | Our code and dataset are available at https://github.com/cdhx/QDTQA
Open Datasets | Yes | The questions in QDTrees are derived from two complex KBQA datasets: ComplexWebQuestions (CWQ) (Talmor and Berant 2018) and LC-QuAD 1.0 (LC) (Trivedi et al. 2017).
Dataset Splits | Yes | For CWQ, we annotate three subsets with 2,000/500/500 questions randomly sampled from the training/validation/test sets, respectively. ... Since LC does not provide an official validation set, we split the training set into a new training set (the first 3,200 questions) and a validation set (the last 800 questions).
Hardware Specification | Yes | We train our models for 100 epochs on an NVIDIA GeForce RTX 3090 GPU and save the best checkpoints on the validation set.
Software Dependencies | No | The paper mentions 'PyTorch' and 'Hugging Face' as frameworks, and 'T5-base' and 'BERT-base' as models, but does not provide specific version numbers for any of these software components.
Experiment Setup | Yes | The batch sizes for ClueNet and DecipherNet are set to 64. ... The batch size is set to 16 and the max length is set to 196. The entity disambiguation model is based on BERT-base, in which we set batch size to 16 and max length to 96.
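The LC-QuAD 1.0 re-split quoted in the Dataset Splits row is deterministic (first 3,200 training questions kept for training, last 800 used as validation), so it can be reproduced with a few lines. The sketch below is not from the paper; the file name "train-data.json" is an assumption about how the LC-QuAD 1.0 training file is stored locally.

    import json

    # Minimal sketch of the LC-QuAD 1.0 re-split described above:
    # the original training set is cut into a new training set
    # (the first 3,200 questions) and a validation set (the rest).
    # "train-data.json" is an assumed local path, not from the paper.
    with open("train-data.json", "r", encoding="utf-8") as f:
        lc_train = json.load(f)  # list of question records

    new_train = lc_train[:3200]  # first 3,200 questions -> new training set
    new_valid = lc_train[3200:]  # remaining questions -> validation set (800)

The CWQ annotation subsets, by contrast, are random samples (2,000/500/500 from train/validation/test) and the paper's quoted text does not give a seed, so they cannot be reconstructed this way.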
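Collecting the numbers from the Hardware Specification and Experiment Setup rows in one place, a configuration sketch might look like the following. The dictionary layout, the key names, and the assignment of T5-base to the decomposition models are illustrative assumptions; only the batch sizes, max lengths, epoch count, backbone names, and GPU model come from the quoted text.

    # Hedged summary of the hyperparameters quoted above; component names
    # and structure are assumptions, not the authors' configuration objects.
    QDTQA_CONFIG = {
        "decomposition": {          # ClueNet / DecipherNet; T5-base is assumed
            "backbone": "t5-base",  # from the Software Dependencies row
            "batch_size": 64,
            "epochs": 100,          # best checkpoint kept on the validation set
        },
        "kbqa_model": {             # component whose batch size / max length
            "batch_size": 16,       # are quoted without an explicit name
            "max_length": 196,
        },
        "entity_disambiguation": {  # BERT-base model
            "backbone": "bert-base",
            "batch_size": 16,
            "max_length": 96,
        },
        "hardware": "NVIDIA GeForce RTX 3090",
    }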