Large Scaled Relation Extraction With Reinforcement Learning

Authors: Xiangrong Zeng, Shizhu He, Kang Liu, Jun Zhao

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type: Experimental. We conduct two types of experiments on a publicly released dataset. Experimental results demonstrate the effectiveness of the proposed method compared with baseline models, achieving a 13.36% improvement. To evaluate the effectiveness of our extractor on large-scale data, we train and test our model on the distant-supervision dataset.
Researcher Affiliation: Academia. University of Chinese Academy of Sciences, Beijing, 100049, China; National Laboratory of Pattern Recognition (NLPR), Institute of Automation, Chinese Academy of Sciences, Beijing, 100190, China.
Pseudocode: No. The paper describes algorithms and methods but does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code: No. The paper does not provide concrete access to source code for the described methodology; no links or explicit statements about code availability were found.
Open Datasets: Yes. In this paper, we use the widely used dataset developed by (Riedel, Yao, and McCallum 2010) to evaluate our model. This dataset was created by aligning Freebase with the New York Times (NYT) corpus: the 2005-2006 NYT corpus is used for training and the 2007 corpus for testing. There are two versions of the dataset; the first is comparatively smaller, so we denote it SMALL and the second LARGE. The SMALL dataset itself has two versions. The original version is used by (Riedel, Yao, and McCallum 2010), (Hoffmann et al. 2011), and (Surdeanu et al. 2012); the filtered version is used by (Zeng et al. 2015), (Jiang et al. 2016), and (Ji et al. 2017). (Zeng et al. 2015) filtered the original dataset by removing a) duplicated sentences in each bag; b) sentences with more than 40 tokens between the two entities; and c) sentences with entity names that are substrings of other entity names in Freebase. In this paper, we use the filtered version of the SMALL dataset. The LARGE dataset is used and released by (Lin et al. 2016) and has a much larger training set than SMALL. We present the detailed statistics of the SMALL and LARGE datasets in Table 1. A bag whose entity pair indicates a non-NA relation is called a positive bag. For each dataset, we randomly select 20% of the training data as validation for tuning the parameters of our model and use the rest to train the models.
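The three filtering rules attributed to (Zeng et al. 2015) can be sketched as a simple pass over each bag. This is our own illustrative reading of the rules, not the authors' code; the dictionary fields (`text`, `head`, `tail`, `head_pos`, `tail_pos`) and the function name are assumptions.

```python
# Hypothetical sketch of the SMALL-dataset filtering rules described
# in the review: (a) deduplicate sentences within a bag, (b) drop
# sentences with more than 40 tokens between the two entities,
# (c) drop sentences whose entity name is a substring of another
# Freebase entity name. Field names are our own assumptions.

def filter_bag(sentences, freebase_entity_names):
    seen = set()
    kept = []
    for s in sentences:
        # (a) duplicated sentences in the bag
        if s["text"] in seen:
            continue
        seen.add(s["text"])
        # (b) more than 40 tokens between the two entity positions
        lo, hi = sorted((s["head_pos"], s["tail_pos"]))
        if hi - lo - 1 > 40:
            continue
        # (c) entity name is a strict substring of another entity name
        if any((s["head"] != name and s["head"] in name) or
               (s["tail"] != name and s["tail"] in name)
               for name in freebase_entity_names):
            continue
        kept.append(s)
    return kept
```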
Dataset Splits: Yes. For each dataset, we randomly select 20% of the training data as validation for tuning the parameters of our model and use the rest to train the models.
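The 80/20 random split could be reproduced along these lines; the function name and the fixed seed are our assumptions (the paper does not report a seed).

```python
import random

def split_train_valid(bags, valid_frac=0.2, seed=0):
    """Randomly hold out a fraction of training bags for validation,
    as described in the review; the seed is an assumption."""
    rng = random.Random(seed)
    bags = list(bags)
    rng.shuffle(bags)
    n_valid = int(len(bags) * valid_frac)
    return bags[n_valid:], bags[:n_valid]
```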
Hardware Specification: No. The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies: No. The paper mentions software such as the word2vec toolkit and Adam but does not specify version numbers for any software dependencies.
Experiment Setup: Yes. The dimension of word embedding |vw| is set to 50, the dimension of position embedding |vp| is set to 5, the window size w of the filters is 3, and the number of filters K is set to 230. The batch size is fixed to 50 and the dropout probability to 0.5. When training, we use Adam (Kingma and Ba 2015) to optimize parameters. It is worth mentioning that pretraining is important in reinforcement learning. The learning rate is reset to 0.0001 and the batch size to 2000 when applying reinforcement learning.
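The reported hyperparameters can be collected into a single configuration object, which makes the derived input dimension explicit. This is a minimal sketch: the dataclass, its field names, and the assumption that each token carries two position embeddings (one per entity, as in typical PCNN-style encoders) are ours, not the paper's.

```python
# Hedged sketch of the encoder configuration reported in the review;
# the class and field names are our own. input_dim assumes two
# position embeddings per token (distance to each entity).
from dataclasses import dataclass

@dataclass
class ExtractorConfig:
    word_dim: int = 50       # |vw|
    pos_dim: int = 5         # |vp|
    window: int = 3          # filter window size w
    num_filters: int = 230   # K
    batch_size: int = 50
    dropout: float = 0.5
    optimizer: str = "adam"  # Adam (Kingma and Ba 2015)
    # settings reset when reinforcement learning is applied
    rl_learning_rate: float = 1e-4
    rl_batch_size: int = 2000

    @property
    def input_dim(self) -> int:
        # word embedding plus two position embeddings per token
        return self.word_dim + 2 * self.pos_dim

    @property
    def filter_shape(self):
        # (window, per-token input dimension, number of filters)
        return (self.window, self.input_dim, self.num_filters)
```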