Building Joint Spaces for Relation Extraction

Authors: Chang Wang, Liangliang Cao, James Fan

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The proposed method is evaluated both theoretically with a proof for the closed-form solution and experimentally with promising results on both DBpedia and medical relations.
Researcher Affiliation | Industry | Chang Wang, Bridgewater Associates, 20 Westport Road, Wilton, Connecticut, USA, 06897; Liangliang Cao, Yahoo Labs, 229 West 43rd Street, New York, New York, USA, 10036
Pseudocode | No | The paper contains a section '3.6 Algorithm' which describes the algorithm steps in prose, but it is not formatted as a pseudocode block or a labeled algorithm figure.
Open Source Code | No | The paper does not include an explicit statement about releasing source code for its methodology or provide a link to a code repository.
Open Datasets | Yes | The relation data was extracted from DBpedia [Auer et al., 2007], which contains the examples for thousands of different relations in the format of (relation name, argument 1, argument 2). Our medical corpus has incorporated Wikipedia articles and MEDLINE abstracts (2013 version). The relation data used in this experiment was from UMLS [Lindberg et al., 1993].
Dataset Splits | Yes | We divided both the positive and negative set into 3 parts: 40% as training set 1, 30% as training set 2 and the remaining 30% as the test set. The training set 2 was also used to learn the new joint space in both the proposed approach and affine matching.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using 'liblinear package [Fan et al., 2008]' and 'Word2Vec [Mikolov et al., 2013]' but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | In all these experiments, the weight for the positive examples was set to 100. All the other parameters were set to the default values. The parameters used in training were: window size=5, and sample rate=1e-5. In TransE, the learning rate, margin and number of epochs were set to 0.001, 1 and 100. Dimensionality of the latent space was set to 100 for both TRESCAL and TransE. The default value of µ is 1, which means S1 and S2 (defined in Section 3.3) will be associated with equal weights. If µ = 0, the neighborhood relationship will not be respected. The default value of k in kNN graph construction is 10. In all experiments, we set the desired dimension of the joint space to be 100, i.e. d = 100.
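
The 40% / 30% / 30% division quoted under Dataset Splits can be illustrated with a minimal Python sketch. This is not the authors' code, since the paper does not release any. The function names, the seeded shuffle, and the integer rounding are all assumptions made for illustration; the paper does not say how examples were assigned to each part.

```python
import random


def split_examples(examples, seed=0):
    """Split one example set into 40% / 30% / 30% parts.

    Assumption: examples are shuffled with a fixed seed before splitting.
    The paper only states the proportions, not the assignment procedure.
    """
    items = list(examples)
    random.Random(seed).shuffle(items)
    n = len(items)
    first_cut = (n * 4) // 10                # end of training set 1 (40%)
    second_cut = first_cut + (n * 3) // 10   # end of training set 2 (30%)
    # Whatever remains after the second cut becomes the test set.
    return items[:first_cut], items[first_cut:second_cut], items[second_cut:]


def split_relation_data(positives, negatives, seed=0):
    """Apply the same three-way split to the positive and negative sets."""
    pos_parts = split_examples(positives, seed)
    neg_parts = split_examples(negatives, seed + 1)
    names = ("train1", "train2", "test")
    return {
        name: {"pos": pos, "neg": neg}
        for name, pos, neg in zip(names, pos_parts, neg_parts)
    }


if __name__ == "__main__":
    # Hypothetical toy data: integers stand in for relation examples.
    splits = split_relation_data(range(100), range(100, 150))
    for name, part in splits.items():
        print(name, len(part["pos"]), len(part["neg"]))
```

In the setup described above, the part labeled `train2` here would also be the data used to learn the joint space, while `test` is held out for evaluation.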