A Context-Enriched Neural Network Method for Recognizing Lexical Entailment
Authors: Kun Zhang, Enhong Chen, Qi Liu, Chuanren Liu, Guangyi Lv
AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on five datasets show that our approach significantly improves the performance of automatic RLE in comparison with several state-of-the-art methods. |
| Researcher Affiliation | Academia | School of Computer Science and Technology, University of Science and Technology of China (zhkun@mail.ustc.edu.cn, cheneh@ustc.edu.cn, qiliuql@ustc.edu.cn, gylv@mail.ustc.edu.cn); Drexel University (chuanren.liu@drexel.edu) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We use 5 labeled datasets for evaluation. In the datasets, each data entry contains a word pair (x, y) and a label indicating whether x entails y. [...] Table 1 (summary statistics of the five datasets): Kotlerman2010: 2,940 instances (880 positive / 2,060 negative); Bless2011: 14,547 (1,337 / 13,210); Baroni2012: 2,770 (1,385 / 1,385); Turney2014: 1,692 (920 / 772); Levy2014: 12,602 (945 / 11,657). |
| Dataset Splits | Yes | In order to overcome this problem, we first randomly split the vocabulary into train and test words. The word pairs, whose words are only train words or test words, are called train-only or test-only pairs. Then we extract train-only and test-only subsets of each dataset following (Levy et al. 2015). (A sketch of this lexical split is given below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments. |
| Software Dependencies | No | The paper mentions tools like 'Stanford Tagger' and 'word2vec' but does not specify software dependencies with version numbers. |
| Experiment Setup | Yes | P, denoting the dimension of word embeddings, is set as 300. [...] To be specific, we use mini-batch to speed up the training process, in which the batch size can be set from 100 to 300. At the back propagate stage, the learning rate is initialized with one value from 1 to 2. Due to the different characteristics of the datasets, their respective batch size and learning rate might be different. Moreover, in order to avoid overfitting, the learning rate is dynamically updated after a period of iterations (usually 100). We halve the learning rate for every specific number of batches until it reaches the user-specified minimum threshold. (A sketch of this learning-rate schedule is given below the table.) |
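
The Dataset Splits row quotes a lexical split in which the vocabulary, not the set of pairs, is partitioned, so that train and test share no words. Below is a minimal sketch of that procedure under assumed conventions: a plain list of (x, y, label) tuples, a 70/30 vocabulary split, and illustrative function names and toy data that are not the authors' released code.

```python
# Hypothetical sketch of the lexical (train-only / test-only) split described in the
# "Dataset Splits" row, following the procedure attributed to Levy et al. (2015).
# Data format, split ratio, and names are assumptions, not the paper's implementation.
import random

def lexical_split(pairs, train_frac=0.7, seed=0):
    """Split (x, y, label) word pairs so that train and test share no vocabulary.

    Returns (train_only_pairs, test_only_pairs); pairs mixing train and test words
    are discarded, as in the quoted procedure.
    """
    vocab = sorted({w for x, y, _ in pairs for w in (x, y)})
    rng = random.Random(seed)
    rng.shuffle(vocab)
    cut = int(len(vocab) * train_frac)
    train_words = set(vocab[:cut])
    test_words = set(vocab[cut:])

    train_only = [(x, y, l) for x, y, l in pairs
                  if x in train_words and y in train_words]
    test_only = [(x, y, l) for x, y, l in pairs
                 if x in test_words and y in test_words]
    return train_only, test_only

# Toy usage in the (x, y, label) format described in the paper's dataset section.
toy = [("cat", "animal", 1), ("oak", "tree", 1), ("cat", "tree", 0)]
train_pairs, test_pairs = lexical_split(toy, train_frac=0.5)
```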
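The Experiment Setup row quotes mini-batch training (batch size 100 to 300) with an initial learning rate between 1 and 2 that is halved after a period of iterations (usually 100) until a user-specified minimum threshold is reached. The sketch below illustrates such a halving schedule; the minimum threshold and the example values are assumptions, since the paper leaves them dataset-specific.

```python
# Minimal sketch of the learning-rate halving schedule quoted in the
# "Experiment Setup" row. halve_every follows the "usually 100" remark;
# min_lr and the usage values are placeholders, not values from the paper.
def halving_schedule(initial_lr=1.0, halve_every=100, min_lr=1e-3):
    """Yield the learning rate to use for each successive mini-batch."""
    lr = initial_lr
    batch = 0
    while True:
        yield lr
        batch += 1
        # Halve the rate every `halve_every` batches until it hits the minimum.
        if batch % halve_every == 0 and lr > min_lr:
            lr = max(lr / 2.0, min_lr)

# Usage: draw one rate per mini-batch inside the training loop.
schedule = halving_schedule(initial_lr=1.5, halve_every=100, min_lr=0.01)
for step, lr in zip(range(500), schedule):
    pass  # e.g. apply an SGD update with this rate on a batch of 100-300 word pairs
```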