Reasoning about Entailment with Neural Attention
Authors: Tim Rocktäschel, Edward Grefenstette, Karl Moritz Hermann, Tomáš Kočiský, Phil Blunsom
ICLR 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On a large entailment dataset this model outperforms the previous best neural model and a classifier with engineered features by a substantial margin. Our benchmark LSTM achieves an accuracy of 80.9% on SNLI, outperforming a simple lexicalized classifier tailored to RTE by 2.7 percentage points. An extension with word-by-word neural attention surpasses this strong benchmark LSTM result by 2.6 percentage points, setting a new state-of-the-art accuracy of 83.5% for recognizing entailment on SNLI. |
| Researcher Affiliation | Collaboration | Tim Rocktäschel, University College London, t.rocktaschel@cs.ucl.ac.uk; Edward Grefenstette & Karl Moritz Hermann, Google DeepMind, {etg,kmh}@google.com; Tomáš Kočiský & Phil Blunsom, Google DeepMind & University of Oxford, {tkocisky,pblunsom}@google.com |
| Pseudocode | No | The paper provides mathematical equations for the LSTM and attention mechanisms but does not include structured pseudocode or algorithm blocks. (A sketch of the word-by-word attention equations appears after this table.) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | We conduct experiments on the Stanford Natural Language Inference corpus (SNLI, Bowman et al., 2015). |
| Dataset Splits | Yes | Subsequently, we take the best configuration based on performance on the validation set, and evaluate only that configuration on the test set. (Table 1 in the paper reports results on the Train, Dev, and Test splits of SNLI.) |
| Hardware Specification | No | The paper does not provide specific details on the hardware used for experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | We use ADAM (Kingma and Ba, 2015) for optimization with a first momentum coefficient of 0.9 and a second momentum coefficient of 0.999. (The paper names the optimizer and its coefficients but does not specify software versions.) |
| Experiment Setup | Yes | For every model we perform a small grid search over combinations of the initial learning rate [1E-4, 3E-4, 1E-3], dropout [0.0, 0.1, 0.2] and ℓ2 regularization strength [0.0, 1E-4, 3E-4, 1E-3]. (See the grid-search sketch after this table.) |
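Since the paper gives the attention mechanism only as equations, here is a minimal NumPy sketch of the word-by-word attention recurrence (Section 2.4 of the paper): M_t = tanh(W^y Y + (W^h h_t + W^r r_{t-1}) ⊗ e_L), α_t = softmax(w^⊤ M_t), r_t = Y α_t^⊤ + tanh(W^t r_{t-1}). The random weight initialization and the `softmax` helper below are illustrative assumptions, not the authors' code; only the equations are from the paper.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def word_by_word_attention(Y, h_states, k, rng):
    """Sketch of the paper's word-by-word attention recurrence.

    Y        : (k, L) premise output vectors from the first LSTM
    h_states : iterable of (k,) hypothesis hidden states h_1..h_N
    k        : hidden dimension; weights are randomly initialized here
               purely for illustration (the paper learns them).
    """
    # Illustrative parameter initialization (assumption, not from the paper).
    W_y, W_h, W_r, W_t = (rng.standard_normal((k, k)) * 0.1 for _ in range(4))
    w = rng.standard_normal(k) * 0.1

    r = np.zeros(k)  # attention memory r_0
    for h_t in h_states:
        # M_t = tanh(W^y Y + (W^h h_t + W^r r_{t-1}) ⊗ e_L)
        M = np.tanh(W_y @ Y + (W_h @ h_t + W_r @ r)[:, None])
        alpha = softmax(w @ M)            # attention over the L premise words
        r = Y @ alpha + np.tanh(W_t @ r)  # r_t: weighted premise + memory
    # r (= r_N) then feeds the final representation h* = tanh(W^p r_N + W^x h_N).
    return r
```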
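The experiment-setup row above can be made concrete with a short sketch of the described grid search. The hyperparameter grids and the ADAM momentum coefficients (0.9, 0.999) are from the paper; `train_and_eval` is a hypothetical entry point assumed to train one configuration and return its validation accuracy.

```python
from itertools import product

def grid_search(train_and_eval):
    """Sketch of the paper's hyperparameter search: train every grid
    combination, pick the best configuration on the validation set, and
    evaluate only that configuration on the test set."""
    learning_rates = [1e-4, 3e-4, 1e-3]
    dropouts       = [0.0, 0.1, 0.2]
    l2_strengths   = [0.0, 1e-4, 3e-4, 1e-3]
    adam_betas = (0.9, 0.999)  # first/second momentum coefficients (from the paper)

    best = None
    for lr, dropout, l2 in product(learning_rates, dropouts, l2_strengths):
        # train_and_eval is hypothetical: returns dev-set accuracy for one config.
        dev_acc = train_and_eval(lr, dropout, l2, betas=adam_betas)
        if best is None or dev_acc > best[0]:
            best = (dev_acc, lr, dropout, l2)
    return best  # only this configuration is then run on the test set
```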