Learning to Query, Reason, and Answer Questions On Ambiguous Texts

Authors: Xiaoxiao Guo, Tim Klinger, Clemens Rosenbaum, Joseph P. Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerry Tesauro, Satinder Singh

ICLR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our architectures on four QRAQ dataset types, and scale the complexity for each along multiple dimensions. We evaluate our methods on four types of datasets described below. Each dataset contains 107,000 QRAQ problems, with 100,000 for training, 2000 for testing, and 5000 for validation.
Researcher Affiliation | Collaboration | Xiaoxiao Guo, Computer Science & Engineering, University of Michigan, guoxiao@umich.edu; Tim Klinger, IBM Watson Research, Yorktown Heights, NY, tklinger@us.ibm.com; Clemens Rosenbaum, Computer Science, UMass Amherst, cgbr@cs.umich.edu; Joseph P. Bigus, Murray Campbell, Ban Kawas, Kartik Talamadupula, Gerald Tesauro, IBM Watson Research, Yorktown Heights, NY, {jbigus,mcam,bkawas,krtalamad,gtesauro}@us.ibm.com; Satinder Singh, Computer Science & Engineering, University of Michigan, baveja@umich.edu
Pseudocode | No | The paper describes the control flow and architectures but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | We will include a detailed description of the simulator and this algorithm when we release the QRAQ datasets to the research community.
Open Datasets | No | Each dataset contains 107,000 QRAQ problems, with 100,000 for training, 2000 for testing, and 5000 for validation. We will include a detailed description of the simulator and this algorithm when we release the QRAQ datasets to the research community.
Dataset Splits | Yes | Each dataset contains 107,000 QRAQ problems, with 100,000 for training, 2000 for testing, and 5000 for validation.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments.
Software Dependencies | No | The paper mentions using 'Adam (Kingma & Ba (2015))' but does not name any software libraries or provide version numbers for its dependencies.
Experiment Setup | Yes | The number of memory hops is fixed to 4. The embedding dimensionality is fixed to 50. ... Specifically, the reward is +1 for correct final answers and -5 for wrong final answers. We explored five pairs of query reward values for the curriculum: +/-0.01, +/-0.05, +/-0.1, +/-0.5, +/-1, and found that +/-0.05 performed best on a validation set, so that is what we use for our experiments. ... For our experiments, ϵ = 0.1.
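
The values quoted in the Dataset Splits and Experiment Setup rows can be collected into one small configuration object, which makes it easy to check that the split sizes add up to the stated 107,000 problems. The following is a minimal sketch, not the authors' code: the class name `QRAQSetup`, its field names, and both helper functions are hypothetical. The summary also does not say when a query earns the positive versus the negative reward, so the `is_helpful` flag is an assumption supplied by the caller, and the role of ϵ is left uninterpreted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QRAQSetup:
    """Hypothetical container for the values quoted in the table above."""

    memory_hops: int = 4                  # "number of memory hops is fixed to 4"
    embedding_dim: int = 50               # "embedding dimensionality is fixed to 50"
    epsilon: float = 0.1                  # quoted as-is; its role is not stated in the summary
    correct_answer_reward: float = 1.0    # +1 for correct final answers
    wrong_answer_reward: float = -5.0     # -5 for wrong final answers
    query_reward_magnitude: float = 0.05  # the +/-0.05 pair chosen on validation
    n_train: int = 100_000
    n_test: int = 2_000
    n_valid: int = 5_000

    @property
    def n_total(self) -> int:
        """Total number of problems per dataset across all three splits."""
        return self.n_train + self.n_test + self.n_valid


def final_answer_reward(setup: QRAQSetup, is_correct: bool) -> float:
    """Reward for the final answer, using the quoted +1 / -5 scheme."""
    return setup.correct_answer_reward if is_correct else setup.wrong_answer_reward


def query_reward(setup: QRAQSetup, is_helpful: bool) -> float:
    """Reward for a single query.

    Assumption: the positive value applies to a query the caller judges
    helpful and the negative value otherwise; the summary gives only the
    +/- magnitudes, not the criterion.
    """
    magnitude = setup.query_reward_magnitude
    return magnitude if is_helpful else -magnitude


setup = QRAQSetup()
print(setup.n_total)
print(final_answer_reward(setup, is_correct=False))
print(query_reward(setup, is_helpful=True))
```

Keeping the dataclass frozen prevents accidental changes to the quoted values during a run; the other four query reward pairs from the search could be tried by constructing `QRAQSetup(query_reward_magnitude=...)` with a different magnitude.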