Generative Question Answering: Learning to Answer the Whole Question
Authors: Mike Lewis, Angela Fan
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 3 (Experiments); Table 1: Exact Match (EM) and F1 on SQUAD, comparing to the best published single models at the time of submission (September 2018). (An EM/F1 scoring sketch follows the table.) |
| Researcher Affiliation | Industry | Mike Lewis & Angela Fan, Facebook AI Research, {mikelewis,angelafan}@fb.com |
| Pseudocode | No | No pseudocode or algorithm blocks found in the paper. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate our model (GQA) on the SQUAD dataset to test its robustness to diverse syntactic and lexical inferences. Results are shown in Table 1... We evaluate the ability of our model to perform multihop reasoning on the CLEVR dataset, which consists of images paired with automatically generated questions that test visual reasoning. |
| Dataset Splits | Yes | "A correct answer is contained in the beam for over 98.5% of validation questions, suggesting that approximate inference is not a major cause of errors." and "The validation set is created from questions whose answers are the named entity type, but there must be multiple occurrences of that type in the document." |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions models like ELMo and ResNet-101, but does not provide specific software dependencies with version numbers (e.g., programming language versions, library versions like PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | Hyperparameters and training details are fully described in Appendix A. (Section 2.1) For example: The encoder contains 2 answer-independent LSTM layers and 3 answer-dependent LSTM layers, all of hidden size 128. The decoder contains 9 blocks, all with hidden size d = 256. We apply dropout (p = 0.55)... We train generatively with batches of 10 documents, using a cosine learning rate schedule with a period of 1 epoch, warming up over the first 5 epochs to a maximum learning rate of 10⁻⁴. During fine-tuning... Fine-tuning uses stochastic gradient descent with single-question batches, learning rate 5 × 10⁻⁵, and momentum 0.97. (A configuration sketch follows the table.) |
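The Experiment Setup row quotes concrete hyperparameters from Appendix A. As a reading aid, the minimal Python sketch below collects those values into a hypothetical config object and implements one plausible cosine-with-warmup learning-rate schedule; the class and function names, and the exact way the warmup interacts with the cosine period, are assumptions, since the authors release no code.

```python
import math
from dataclasses import dataclass

# Hypothetical names; values are taken from the Appendix A excerpt quoted above.
@dataclass
class GQAConfig:
    enc_answer_independent_lstm_layers: int = 2
    enc_answer_dependent_lstm_layers: int = 3
    enc_hidden_size: int = 128
    dec_blocks: int = 9
    dec_hidden_size: int = 256
    dropout: float = 0.55
    docs_per_batch: int = 10
    max_lr: float = 1e-4            # peak learning rate for generative training
    warmup_epochs: int = 5
    cosine_period_epochs: float = 1.0
    finetune_lr: float = 5e-5       # SGD with single-question batches
    finetune_momentum: float = 0.97


def generative_lr(epochs_done: float, cfg: GQAConfig = GQAConfig()) -> float:
    """Cosine schedule with a 1-epoch period and linear warmup over the first
    5 epochs. How the warmup scales the cosine is an assumption; the paper only
    states the period, warmup length, and maximum learning rate."""
    warmup = min(1.0, epochs_done / cfg.warmup_epochs) if cfg.warmup_epochs else 1.0
    phase = (epochs_done % cfg.cosine_period_epochs) / cfg.cosine_period_epochs
    return warmup * cfg.max_lr * 0.5 * (1.0 + math.cos(math.pi * phase))
```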
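Table 1 in the paper reports Exact Match (EM) and F1 on SQuAD. These are the standard SQuAD metrics, computed after lowercasing and removing punctuation and the articles a/an/the: EM checks for an exact string match, while F1 is a token-overlap score. The sketch below mirrors the usual SQuAD evaluation recipe in spirit; it is not the authors' code, and the function names are illustrative.

```python
import re
import string
from collections import Counter

def normalize(text: str) -> str:
    """Standard SQuAD answer normalization: lowercase, drop punctuation and
    the articles a/an/the, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> float:
    return float(normalize(prediction) == normalize(gold))

def token_f1(prediction: str, gold: str) -> float:
    pred, ref = normalize(prediction).split(), normalize(gold).split()
    overlap = sum((Counter(pred) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(pred), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

# SQuAD scores each prediction against all gold answers and keeps the maximum.
print(exact_match("the Eiffel Tower", "Eiffel Tower"))               # 1.0
print(round(token_f1("Eiffel Tower in Paris", "Eiffel Tower"), 2))   # 0.67
```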