Multi-step Retriever-Reader Interaction for Scalable Open-domain Question Answering

Authors: Rajarshi Das, Shehzaad Dhuliawala, Manzil Zaheer, Andrew McCallum

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct analysis and show that iterative interaction helps in retrieving informative paragraphs from the corpus. Finally, we show that our multi-step reasoning framework brings consistent improvement when applied to two widely used reader architectures (DrQA and BiDAF) on various large open-domain datasets: TriviaQA-unfiltered, Quasar-T, SearchQA, and SQuAD-open.
Researcher Affiliation | Collaboration | Rajarshi Das¹, Shehzaad Dhuliawala², Manzil Zaheer³ & Andrew McCallum¹ ({rajarshi,mccallum}@cs.umass.edu, shehzaad.dhuliawala@microsoft.com, manzil@zaheer.ml); ¹University of Massachusetts, Amherst; ²Microsoft Research, Montréal; ³Google AI, Mountain View
Pseudocode | Yes | Algorithm 1: Multi-step reasoning for open-domain QA (a hedged sketch of this loop appears after the table)
Open Source Code | Yes | Code and pretrained models are available at https://github.com/rajarshd/Multi-Step-Reasoning
Open Datasets | Yes | We experiment on the following large open-domain QA datasets: (a) TriviaQA-unfiltered, a version of TriviaQA (Joshi et al., 2017) built for open-domain QA. (c) SearchQA (Dunn et al., 2017) is another open-domain dataset... (d) Quasar-T (Dhingra et al., 2017)... (e) SQuAD-open: We also experimented on the open-domain version of the SQuAD dataset. For a fair comparison to baselines, our evidence corpus was created by retrieving the top-5 Wikipedia documents as returned by the pipeline of Chen et al. (2017).
Dataset Splits | No | The paper mentions a 'development set' and 'test set' but does not provide specific percentages or counts for training, validation, and test splits across all datasets, nor does it explicitly cite a source for predefined splits.
Hardware Specification | Yes | To test for scalability, we increase the number of paragraphs ranging from 500 to 100 million and test on a single Titan-X GPU.
Software Dependencies | No | The paper mentions algorithms and frameworks used (e.g., Adam, LSTM, GRU, DrQA, BiDAF) but does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | The number of layers of the bi-directional LSTM encoder is set to three and we use Adam (Kingma & Ba, 2014) for optimization. (A minimal configuration sketch follows below.)
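
To make the Pseudocode row concrete, here is a minimal sketch of the multi-step retriever-reader loop named in Algorithm 1: retrieve paragraphs by inner product against precomputed embeddings, read them, then reformulate the query from the reader's state and repeat. The callables `encode_question`, `reader`, and `reformulate` are hypothetical stand-ins for the paper's components, not the authors' actual API.

```python
# Hedged sketch of Algorithm 1 (multi-step reasoning for open-domain QA).
# `encode_question`, `reader`, and `reformulate` are assumed, illustrative
# callables; only the loop structure follows the paper's description.
import torch

def multi_step_qa(question, paragraphs, paragraph_embs,
                  encode_question, reader, reformulate,
                  num_steps=3, top_k=5):
    """Iteratively retrieve, read, and reformulate; return the best span."""
    query = encode_question(question)           # initial query vector, shape (d,)
    best_span, best_score = None, float("-inf")
    for _ in range(num_steps):
        # Retriever: inner-product scores against precomputed paragraph
        # embeddings of shape (num_paragraphs, d), so retrieval scales.
        scores = paragraph_embs @ query
        top_idx = torch.topk(scores, top_k).indices
        # Reader: scores answer spans and exposes its internal state.
        span, score, reader_state = reader(
            question, [paragraphs[i] for i in top_idx])
        if score > best_score:
            best_span, best_score = span, score
        # Multi-step reasoner: GRU-style update of the query from the
        # reader state, driving the next retrieval step.
        query = reformulate(query, reader_state)
    return best_span
```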
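
And for the Experiment Setup row, a minimal sketch of the reported configuration: a three-layer bidirectional LSTM encoder trained with Adam. The input size, hidden size, and learning rate below are placeholder assumptions, not values reported in the paper.

```python
# Sketch of the reported setup: 3-layer bi-directional LSTM + Adam.
# input_size, hidden_size, and lr are illustrative placeholders.
import torch.nn as nn
import torch.optim as optim

encoder = nn.LSTM(input_size=300, hidden_size=128,
                  num_layers=3, bidirectional=True, batch_first=True)
optimizer = optim.Adam(encoder.parameters(), lr=1e-3)
```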