Reinforcement Learning for Relation Classification From Noisy Data

Authors: Jun Feng, Minlie Huang, Li Zhao, Yang Yang, Xiaoyan Zhu

AAAI 2018

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Experiment results show that our model can deal with the noise of data effectively and obtains better performance for relation classification at the sentence level." |
| Researcher Affiliation | Collaboration | State Key Lab. of Intelligent Technology and Systems, National Lab. for Information Science and Technology, Dept. of Computer Science and Technology, Tsinghua University, Beijing 100084, PR China; Microsoft Research Asia; College of Computer Science and Technology, Zhejiang University |
| Pseudocode | Yes | ALGORITHM 1: Overall Training Procedure; ALGORITHM 2: Reinforcement Learning Algorithm for the Instance Selector |
| Open Source Code | No | The paper states: "All the baselines were implemented with the source codes released by (Li et al. 2016)." This refers to the code for the baselines, not to the authors' own source code for their proposed method; no statement or link for their own code is provided. |
| Open Datasets | Yes | "To evaluate our model, we adopted a widely used dataset generated by the sentences in NYT and developed by (Riedel, Yao, and McCallum 2010)." Dataset URL: http://iesl.cs.umass.edu/riedel/ecml/ |
| Dataset Splits | Yes | "There are 522,611 sentences, 281,270 entity pairs, and 18,252 relational facts in the training data; and 172,448 sentences, 96,678 entity pairs and 1,950 relational facts in the test data." (...) "Similar to previous studies, we tuned our model using three-fold cross validation." |
| Hardware Specification | No | The paper does not report the hardware used for its experiments (no GPU/CPU models, memory amounts, or other machine specifications). |
| Software Dependencies | No | The paper mentions word2vec and the TransE model (Bordes et al. 2013) but does not provide version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | "For the parameters of the instance selector, we set the dimension of entity embedding as 50, the learning rate as 0.02/0.01 at the pretraining stage and joint training stage respectively. The delay coefficient τ is 0.001. For the parameters of the relation classifier, the word embedding dimension dw = 50 and the position embedding dimension dp = 5. The window size of the convolution layer l is 3. The learning rate of the instance selector is α = 0.02 both at the pre-training and joint training stage. The batch size is fixed to 160. The training episode number L = 25. We employed a dropout strategy with a probability of 0.5 during the training of the CNN component." |
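The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. The class and field names below are illustrative assumptions (the paper does not define a config object), and the per-token feature dimension follows the common Zeng et al.-style composition of one word embedding plus two position embeddings, which the quoted text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class SelectorConfig:
    """Instance-selector (RL policy) hyperparameters, as quoted from the paper."""
    entity_embedding_dim: int = 50
    lr_pretrain: float = 0.02     # learning rate at the pre-training stage
    lr_joint: float = 0.01        # learning rate at the joint-training stage
    delay_tau: float = 0.001      # delay coefficient τ (assumed to govern the
                                  # selector's delayed parameter update)


@dataclass
class ClassifierConfig:
    """CNN relation-classifier hyperparameters, as quoted from the paper."""
    word_embedding_dim: int = 50      # d_w
    position_embedding_dim: int = 5   # d_p
    conv_window: int = 3              # convolution window size l
    batch_size: int = 160
    training_episodes: int = 25       # episode number L
    dropout: float = 0.5              # dropout probability during CNN training

    def token_feature_dim(self) -> int:
        # Assumed composition: each token is its word embedding concatenated
        # with two position embeddings (distance to each entity mention).
        return self.word_embedding_dim + 2 * self.position_embedding_dim


cfg = ClassifierConfig()
print(cfg.token_feature_dim())  # 50 + 2*5 = 60
```

Under that assumed composition, each token entering the convolution layer is a 60-dimensional vector.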