Convolutional-Match Networks for Question Answering

Authors: Spyridon Samothrakis, Tom Vodopivec, Michael Fairbank, Maria Fasli

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We achieve state-of-the-art results in the bAbI tasks, outperforming Memory Networks and the Differentiable Neural Computer, both in terms of accuracy and stability (i.e. variance) of results.
Researcher Affiliation | Academia | (1) IADS, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK; (2) CSEE, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK; (3) Faculty of Computer and Information Science, University of Ljubljana, Vecna pot 113, Ljubljana, Slovenia
Pseudocode | No | The paper includes a diagram (Figure 1) illustrating the network architecture but does not provide structured pseudocode or algorithm blocks.
Open Source Code | Yes | The full code of this work can be found here: https://github.com/ssamot/conv-match
Open Datasets | Yes | In this paper, we focus on the bAbI dataset [Weston et al., 2015].
Dataset Splits | Yes | The version of bAbI used in this paper contains 20 different question answering tasks, each having 10,000 training examples. The test set of these tasks comprises 1,000 examples. When working with the bAbI dataset, results may be presented for either training a different network/classifier for each task, or for training a single network to solve all tasks together. In this paper, we train one network to solve all tasks together, referred to as joint training, meaning the dataset comprises 200,000 training and 20,000 testing instances. ... The hyperparameters were chosen using a validation set on the smaller version of the bAbI tasks.
Hardware Specification | Yes | The time it took to go through one iteration (i.e. a full sweep over the training dataset) was 760 seconds on an Nvidia 1080.
Software Dependencies | No | We used Keras (a neural network framework), Theano (a deep learning library) and Python for our implementation. No specific version numbers for Keras, Theano, or Python are provided.
Experiment Setup | Yes | Each network training phase lasts for 50 epochs, followed by measurements on the (out-of-core-training) test set. ... The dropout rate used was 0.2. The training algorithm used was Adam [Kingma and Ba, 2015], with initial learning rate 0.001. ... Each neural layer had 128 units.
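
The joint-training arithmetic in the Dataset Splits row (20 tasks x 10,000 training / 1,000 test questions each) can be made concrete with a short sketch. This is a minimal illustration assuming the standard bAbI v1.2 "en-10k" archive layout and file naming; it is not taken from the paper's released code.

```python
# Minimal sketch of assembling the joint-training split described above,
# assuming the bAbI v1.2 "en-10k" release (directory names are the archive's
# defaults, not paths from the paper).
import glob

def parse_babi(path):
    """Yield (story, question, answer) triples from one bAbI task file."""
    story = []
    for line in open(path):
        line_id, text = line.split(' ', 1)
        if int(line_id) == 1:      # a new story starts when the line id resets to 1
            story = []
        if '\t' in text:           # question lines are tab-separated: q, answer, support
            question, answer, _ = text.split('\t')
            yield (list(story), question.strip(), answer.strip())
        else:
            story.append(text.strip())

train = [t for f in sorted(glob.glob('tasks_1-20_v1-2/en-10k/*_train.txt'))
         for t in parse_babi(f)]
test = [t for f in sorted(glob.glob('tasks_1-20_v1-2/en-10k/*_test.txt'))
        for t in parse_babi(f)]
print(len(train), len(test))  # expected: 200000 20000, matching the paper's counts
```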
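For the Experiment Setup row, a minimal Keras sketch of the reported hyperparameters (128-unit layers, dropout 0.2, Adam with initial learning rate 0.001, 50 epochs). The plain dense stack and the placeholder data are illustrative assumptions only; the paper's actual convolutional-match architecture differs, and a Keras 2.x API is assumed.

```python
# Hedged sketch of the reported training configuration; the dense stack
# stands in for the paper's architecture and the data is dummy.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam

input_dim, num_classes = 50, 20  # placeholder sizes, not from the paper
x_train = np.random.rand(1000, input_dim)
y_train = np.eye(num_classes)[np.random.randint(num_classes, size=1000)]

model = Sequential([
    Dense(128, activation='relu', input_dim=input_dim),  # "each neural layer had 128 units"
    Dropout(0.2),                                        # "the dropout rate used was 0.2"
    Dense(128, activation='relu'),
    Dropout(0.2),
    Dense(num_classes, activation='softmax'),
])
model.compile(optimizer=Adam(lr=0.001),  # Adam with initial learning rate 0.001
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=50, verbose=0)  # one 50-epoch training phase
```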