Convolutional-Match Networks for Question Answering
Authors: Spyridon Samothrakis, Tom Vodopivec, Michael Fairbank, Maria Fasli
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We achieve state-of-the-art results in the bAbI tasks, outperforming Memory Networks and the Differentiable Neural Computer, both in terms of accuracy and stability (i.e. variance) of results. |
| Researcher Affiliation | Academia | IADS, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK; CSEE, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, UK; Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, Ljubljana, Slovenia |
| Pseudocode | No | The paper includes a diagram (Figure 1) illustrating the network architecture but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The full code of this work can be found here: https://github.com/ssamot/conv-match. |
| Open Datasets | Yes | In this paper, we focus on the bAbI dataset [Weston et al., 2015]. |
| Dataset Splits | Yes | The version of bAbI used in this paper contains 20 different question answering tasks, each having 10,000 training examples. The test set of each task comprises 1,000 examples. When working with the bAbI dataset, results may be presented either for training a different network/classifier for each task, or for training a single network to solve all tasks together. In this paper, we train one network to solve all tasks together, referred to as joint training, meaning the dataset comprises 200,000 training and 20,000 testing instances. ... The hyperparameters were chosen using a validation set on the smaller version of the bAbI tasks. (A sketch of this joint-training data assembly appears after the table.) |
| Hardware Specification | Yes | The time it took to go through one iteration (i.e. a full sweep over the training dataset) was 760 seconds on an Nvidia 1080. |
| Software Dependencies | No | We used Keras (a neural network framework), Theano (a deep learning library) and Python for our implementation. No specific version numbers for Keras, Theano, or Python are provided. (A version-recording sketch appears after the table.) |
| Experiment Setup | Yes | Each network training phase lasts for 50 epochs, followed by measurements on the (out-of-core-training) test set. ... The dropout rate used was 0.2. The training algorithm used was Adam [Kingma and Ba, 2015], with initial learning rate 0.001. ... Each neural layer had 128 units. (A configuration sketch with these hyperparameters appears after the table.) |
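
The joint-training split described in the Dataset Splits row can be reproduced mechanically: 20 tasks with 10,000 training and 1,000 test examples each yields 200,000/20,000 instances. Below is a minimal sketch, assuming a hypothetical `load_task` helper; the paper's actual loading code lives in the linked repository.

```python
def load_task(task_id, split):
    """Hypothetical loader standing in for the repository's real parser.
    Returns (story, question, answer) tuples for one bAbI task."""
    n = 10000 if split == "train" else 1000
    return [("story", "question", "answer")] * n

train, test = [], []
for task_id in range(1, 21):               # 20 bAbI tasks
    train += load_task(task_id, "train")   # 10,000 examples per task
    test += load_task(task_id, "test")     # 1,000 examples per task

# Joint training: one network sees all tasks together.
assert len(train) == 200000 and len(test) == 20000
```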
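Because the paper omits version numbers for its dependencies, a snippet like the following, run alongside the experiments, would have pinned them. This is a reproducibility aid, not code from the paper.

```python
import sys

# Record the interpreter and library versions used for a run.
print("python", sys.version.split()[0])

import keras
import theano

print("keras", keras.__version__)
print("theano", theano.__version__)
```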
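The Experiment Setup row fully specifies the optimizer, learning rate, dropout rate, layer width, and epoch count. The sketch below wires those reported values into a Keras model skeleton; the layer stack itself is a placeholder, not the paper's convolutional-match architecture (see Figure 1 and the repository for that), and the input/output shapes are assumptions.

```python
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam

# Placeholder stack, NOT the paper's architecture; it only demonstrates
# the reported hyperparameters: 128 units per layer, dropout 0.2.
model = Sequential([
    Dense(128, activation="relu", input_shape=(128,)),  # assumed input shape
    Dropout(0.2),
    Dense(128, activation="relu"),
    Dropout(0.2),
    Dense(20, activation="softmax"),  # output size is an assumption
])

model.compile(optimizer=Adam(lr=0.001),  # Adam, initial learning rate 0.001
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# model.fit(x_train, y_train, epochs=50)  # 50 epochs per training phase
```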