TSQA: Tabular Scenario Based Question Answering

Authors: Xiao Li, Yawei Sun, Gong Cheng (pp. 13297-13305)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To support the study of this task, we construct GeoTSQA. This dataset contains 1k real questions contextualized by tabular scenarios in the geography domain. To solve the task, we extend state-of-the-art MRC methods with TTGen, a novel table-to-text generator. It generates sentences from variously synthesized tabular data and feeds the downstream MRC method with the most useful sentences. Its sentence ranking model fuses the information in the scenario, question, and domain knowledge. Our approach outperforms a variety of strong baseline methods on GeoTSQA. We compared our approach with a variety of strong baseline methods for TSQA. We also evaluated our sentence ranking model, which is the core component of our approach.
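The quoted pipeline (generate candidate sentences from the table, rank them, pass only the most useful ones to the MRC reader) can be sketched as follows. This is an illustrative skeleton, not the authors' implementation: `score_fn` stands in for the paper's neural ranking model that fuses scenario, question, and domain knowledge, and the function name is our own.

```python
def select_sentences(candidates, score_fn, k=3):
    """Sketch of TTGen's final step: rank the sentences generated
    from the synthesized tabular data and keep the top-k for the
    downstream MRC method. The real scorer is a learned model
    conditioned on the scenario and question; here it is any
    callable mapping a sentence to a relevance score."""
    ranked = sorted(candidates, key=score_fn, reverse=True)
    return ranked[:k]
```

For example, with sentence length as a stand-in scorer, `select_sentences(["a", "bb", "ccc"], len, k=2)` returns the two longest candidates.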
Researcher Affiliation | Academia | Xiao Li, Yawei Sun, Gong Cheng. State Key Laboratory for Novel Software Technology, Nanjing University, China. {xiaoli.nju, ywsun}@smail.nju.edu.cn, gcheng@nju.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and data are available on GitHub: https://github.com/nju-websoft/TSQA
Open Datasets | Yes | We constructed GeoTSQA. To the best of our knowledge, it is the first dataset dedicated to the TSQA task. Our code and data are available on GitHub: https://github.com/nju-websoft/TSQA
Dataset Splits | Yes | We performed 5-fold cross-validation. For each fold, we split GeoTSQA into 80% for training and 20% for test. For model selection, we relied on an inner holdout 80%/20% training/development split.
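The evaluation protocol quoted above (5-fold cross-validation with an inner 80%/20% train/development holdout inside each fold's training portion) can be reproduced index-wise with a short helper. This is a sketch under our own assumptions: the function name and the seed are ours, dataset loading is omitted, and the paper does not specify its shuffling procedure.

```python
import random

def tsqa_splits(n_items, n_folds=5, dev_ratio=0.2, seed=0):
    """Yield (train, dev, test) index lists for each of n_folds folds.
    Each fold holds out 1/n_folds of the data as test; the remainder
    is split 80%/20% into train/dev for model selection."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    fold_size = n_items // n_folds
    for k in range(n_folds):
        test = idx[k * fold_size:(k + 1) * fold_size]
        rest = idx[:k * fold_size] + idx[(k + 1) * fold_size:]
        cut = int(len(rest) * (1 - dev_ratio))
        yield rest[:cut], rest[cut:], test

# With the ~1k questions of GeoTSQA, each fold tests on 200 items
# and trains/selects on the remaining 800 (640 train / 160 dev).
splits = list(tsqa_splits(1000))
```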
Hardware Specification | Yes | We ran all the experiments on TITAN RTX GPUs.
Software Dependencies | No | The paper mentions specific pre-trained models such as 'BERT-wwm-ext' and 'CDial-GPT2_LCCC-base' but does not specify versions for the general software libraries and frameworks (e.g., Python, PyTorch, or TensorFlow) that are essential for reproducibility.
Experiment Setup | Yes | We set maximum sequence length = 256, self-attention layers = 12, hidden units = 768, attention heads = 12, learning rate = 1e-5, epochs = 15 for MRC and template-level ranking, epochs = 5 for sentence-level ranking, batch size = 8 for MRC, and batch size = 16 for template-level and sentence-level ranking.
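For readability, the reported hyperparameters can be collected into a single configuration mapping. The values below are exactly those quoted from the paper; the key names are our own and do not come from the authors' code.

```python
# Hyperparameters reported for GeoTSQA experiments (key names assumed).
CONFIG = {
    "max_seq_length": 256,
    "self_attention_layers": 12,
    "hidden_units": 768,
    "attention_heads": 12,
    "learning_rate": 1e-5,
    # Per-component settings: MRC reader vs. the two ranking models.
    "epochs": {"mrc": 15, "template_ranking": 15, "sentence_ranking": 5},
    "batch_size": {"mrc": 8, "template_ranking": 16, "sentence_ranking": 16},
}
```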