Open Question Answering over Tables and Text
Authors: Wenhu Chen, Ming-Wei Chang, Eva Schlinger, William Yang Wang, William W. Cohen
ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our baseline model using an iterative retriever and BERT-based reader achieves an exact match score of less than 10%. We then propose two novel techniques to address the challenge of retrieving and aggregating evidence for OTT-QA. The first technique is to use early fusion to group multiple highly relevant tabular and textual units into a fused block, which provides more context for the retriever to search for. The second technique is to use a cross-block reader to model the cross-dependency between multiple retrieved evidence with global-local sparse attention. Combining these two techniques improves the score significantly, to above 27%. A hedged sketch of the early-fusion grouping step is given after this table. |
| Researcher Affiliation | Collaboration | University of California, Santa Barbara; Google Research |
| Pseudocode | No | No pseudocode or algorithm blocks are present. |
| Open Source Code | No | The paper mentions a data release, but does not provide a clear statement of, or link to, the source code for the models or methods it describes. |
| Open Datasets | Yes | For this purpose, we construct a new dataset, Open Table-and-Text Question Answering (OTT-QA). The data is released at https://github.com/wenhuchen/OTT-QA by the UCSB NLP Group. |
| Dataset Splits | Yes | Finally, we have 41,469 questions in the training set, 2,214 questions in the dev set, and 2,158 questions in the test set. |
| Hardware Specification | Yes | Both are trained using 16 cloud TPUs. |
| Software Dependencies | No | The paper mentions TensorFlow and models such as BERT and ETC, but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | All the models are trained with a learning rate of 1e-5 optimized by AdamW (Loshchilov & Hutter, 2019). We use in-batch negatives (Lee et al., 2019) to train our dense retrievers. For all the dense retrievers, we pre-train for 10K steps using the generated pseudo queries and then fine-tune for another 10K steps using a batch size of 2048. For the cross-block reader, we fine-tune with a batch size of 64. A hedged sketch of the in-batch-negative training step is given after this table. |
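
The early-fusion technique summarized in the Research Type row groups a table segment together with the passages hyperlinked from its cells into one fused block, so the retriever scores table and text evidence as a single unit. Below is a minimal sketch of that grouping step under stated assumptions; the function name `fuse_block`, the field names, and the truncation length are hypothetical illustrations, not the authors' released implementation.

```python
from typing import Dict, List


def fuse_block(table_title: str,
               section_title: str,
               row_cells: List[str],
               linked_passages: Dict[str, str],
               max_passage_tokens: int = 100) -> str:
    """Group one table segment with its hyperlinked passages into a fused block.

    Hypothetical sketch of early fusion: the table context, the row cells, and a
    truncated snippet of each linked passage are concatenated so the retriever
    can index and score them as a single retrieval unit.
    """
    parts = [f"Title: {table_title}", f"Section: {section_title}"]
    # Flatten the table segment (e.g. one row) into a text sequence.
    parts.append("Row: " + " ; ".join(row_cells))
    # Append a truncated snippet of every passage hyperlinked from this row.
    for entity, passage in linked_passages.items():
        snippet = " ".join(passage.split()[:max_passage_tokens])
        parts.append(f"Passage ({entity}): {snippet}")
    return " [SEP] ".join(parts)


if __name__ == "__main__":
    block = fuse_block(
        table_title="List of Olympic medalists",
        section_title="Men's 100 m",
        row_cells=["1996", "Donovan Bailey", "Canada", "9.84"],
        linked_passages={"Donovan Bailey": "Donovan Bailey is a Canadian former sprinter ..."},
    )
    print(block)
```

The fused block gives the retriever more context than an isolated table segment or passage, which is the motivation the abstract gives for early fusion.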
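The Experiment Setup row reports training the dense retrievers with AdamW at a learning rate of 1e-5 using in-batch negatives (Lee et al., 2019). The sketch below shows one such training step with a generic dual encoder; the `TinyEncoder` class, embedding dimensions, and toy batch size are placeholder assumptions and do not reproduce the paper's BERT encoders, batch size of 2048, or TPU setup.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyEncoder(nn.Module):
    """Placeholder encoder standing in for a BERT-based question/block encoder."""

    def __init__(self, vocab_size: int = 30522, dim: int = 128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, dim)

    def forward(self, token_ids: torch.Tensor) -> torch.Tensor:
        return self.embed(token_ids)  # (batch, dim)


question_encoder = TinyEncoder()
block_encoder = TinyEncoder()
params = list(question_encoder.parameters()) + list(block_encoder.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-5)  # learning rate reported in the paper


def in_batch_negative_step(question_ids: torch.Tensor, block_ids: torch.Tensor) -> float:
    """One retriever training step with in-batch negatives (Lee et al., 2019)."""
    q = question_encoder(question_ids)      # (B, dim)
    b = block_encoder(block_ids)            # (B, dim)
    scores = q @ b.t()                      # (B, B) question-block similarity matrix
    targets = torch.arange(scores.size(0))  # positives sit on the diagonal
    loss = F.cross_entropy(scores, targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


# Toy usage with random token ids; the paper trains with a batch size of 2048 on TPUs.
questions = torch.randint(0, 30522, (8, 32))
blocks = torch.randint(0, 30522, (8, 64))
print(in_batch_negative_step(questions, blocks))
```

With in-batch negatives, each question treats every other fused block in the batch as a negative, so the effective number of negatives per example scales with the batch size of 2048 used in the paper.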