A Context-Enriched Neural Network Method for Recognizing Lexical Entailment

Authors: Kun Zhang, Enhong Chen, Qi Liu, Chuanren Liu, Guangyi Lv

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on five datasets show that our approach significantly improves the performance of automatic RLE in comparison with several state-of-the-art methods.
Researcher Affiliation | Academia | School of Computer Science and Technology, University of Science and Technology of China (zhkun@mail.ustc.edu.cn, cheneh@ustc.edu.cn, qiliuql@ustc.edu.cn, gylv@mail.ustc.edu.cn); Drexel University (chuanren.liu@drexel.edu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We use 5 labeled datasets for evaluation. In the datasets, each data entry contains a word pair (x, y) and a label indicating whether x entails y. [...] (A loading sketch for this entry format follows the table.) Table 1 summarizes the five datasets:

Dataset       | #Instances | #Positive | #Negative
Kotlerman2010 |      2,940 |       880 |     2,060
Bless2011     |     14,547 |     1,337 |    13,210
Baroni2012    |      2,770 |     1,385 |     1,385
Turney2014    |      1,692 |       920 |       772
Levy2014      |     12,602 |       945 |    11,657
Dataset Splits | Yes | In order to overcome this problem, we first randomly split the vocabulary into train and test words. The word pairs whose words are all train words or all test words are called train-only or test-only pairs, respectively. Then we extract train-only and test-only subsets of each dataset, following (Levy et al. 2015). (See the split sketch below the table.)
Hardware Specification | No | The paper does not provide specific hardware details used for running its experiments.
Software Dependencies | No | The paper mentions tools like 'Stanford Tagger' and 'word2vec' but does not specify software dependencies with version numbers.
Experiment Setup | Yes | P, denoting the dimension of word embeddings, is set as 300. [...] To be specific, we use mini-batch training to speed up the training process, in which the batch size can be set from 100 to 300. At the back-propagation stage, the learning rate is initialized with one value from 1 to 2. Due to the different characteristics of the datasets, their respective batch sizes and learning rates might differ. Moreover, in order to avoid overfitting, the learning rate is dynamically updated after a period of iterations (usually 100): we halve the learning rate for every specific number of batches until it reaches the user-specified minimum threshold. (See the schedule sketch below the table.)
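
The entry format described in the Open Datasets row can be illustrated with a short loader. This is a minimal sketch, not the authors' code: the tab-separated layout, field order, and the `load_pairs` name are all assumptions here; the paper only states that each entry holds a word pair (x, y) and a binary entailment label.

```python
from typing import List, Tuple

def load_pairs(path: str) -> List[Tuple[Tuple[str, str], bool]]:
    """Read entries of the assumed form "x<TAB>y<TAB>label", one per line."""
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Field order and the "True"/"False" label encoding are assumptions.
            x, y, label = line.rstrip("\n").split("\t")
            pairs.append(((x, y), label == "True"))
    return pairs
```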
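
The lexical split quoted in the Dataset Splits row keeps the train and test vocabularies disjoint, so a model cannot succeed by memorizing individual words. A minimal sketch of that procedure, assuming the `pairs` format from the loader above and an illustrative `test_fraction`; the paper defers the exact extraction to (Levy et al. 2015):

```python
import random

def lexical_split(pairs, test_fraction=0.3, seed=42):
    """Split word pairs so that no word is shared between train and test.

    First split the vocabulary into train and test words, then keep a pair
    only if both of its words fall on the same side of the split.
    """
    rng = random.Random(seed)  # seed and test_fraction are illustrative
    vocab = sorted({w for (x, y), _ in pairs for w in (x, y)})
    rng.shuffle(vocab)
    cut = int(len(vocab) * (1 - test_fraction))
    train_words, test_words = set(vocab[:cut]), set(vocab[cut:])

    train_only = [((x, y), lbl) for (x, y), lbl in pairs
                  if x in train_words and y in train_words]
    test_only = [((x, y), lbl) for (x, y), lbl in pairs
                 if x in test_words and y in test_words]
    return train_only, test_only
```

Pairs that mix a train word with a test word land in neither subset, which is exactly what makes the test-only evaluation lexically unseen.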
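
The learning-rate policy quoted in the Experiment Setup row amounts to a step-decay schedule: halve the rate every fixed number of batches, with a floor. A minimal sketch; the decay interval, initial rate, and minimum threshold below are illustrative, since the paper tunes these per dataset:

```python
def halving_schedule(initial_lr, halve_every, min_lr, batch_index):
    """Learning rate after `batch_index` mini-batches.

    The rate is halved once per `halve_every` batches (the paper mentions
    roughly 100 iterations) until it hits the user-specified minimum.
    """
    lr = initial_lr / (2 ** (batch_index // halve_every))
    return max(lr, min_lr)

# Example: initial rate 1.0, halved every 100 batches, floor at 1e-3.
for step in (0, 99, 100, 250, 2000):
    print(step, halving_schedule(1.0, 100, 1e-3, step))
```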