Neural Knowledge Acquisition via Mutual Attention Between Knowledge Graph and Text

Authors: Xu Han, Zhiyuan Liu, Maosong Sun

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on relation extraction and entity link prediction show that models trained under our joint framework are significantly improved in comparison with other baselines. We conduct experiments on real-world datasets.
Researcher Affiliation | Academia | (1) Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, National Lab for Information Science and Technology, Tsinghua University, Beijing, China; (2) Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing, China. Corresponding author: Zhiyuan Liu (liuzy@tsinghua.edu.cn)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks; methods are described textually and mathematically.
Open Source Code | Yes | The source code of this paper can be obtained from https://github.com/thunlp/JointNRE.
Open Datasets | Yes | We select Freebase (Bollacker et al. 2008) as the KG for joint learning. In this paper, we adopt datasets extracted from Freebase, FB15K and FB60K, in our experiments. We select sentences from the articles of New York Times. We extract 194,385 sentences containing both head and tail entities in FB15K and annotate with the corresponding relations in triples. We name the corpus NYT-FB15K. The sentences for FB60K come from the dataset used in (Riedel, Yao, and McCallum 2010), containing 570,088 sentences, 63,696 entities, 56 relations and 293,175 facts. We name the corpus NYT-FB60K.
Dataset Splits | No | The paper mentions a 'test set' and states that it follows the benchmarks' previous usage, but it does not explicitly define the train/validation/test splits (e.g., percentages or counts) or explicitly mention a validation set.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, or memory specifications) used for running its experiments.
Software Dependencies | No | The paper mentions software components like 'CNN' and 'Skip-Gram' but does not specify their version numbers or list other key libraries with versions, which would be necessary for reproducibility.
Experiment Setup | Yes | In our joint models, we select the learning rate αk for P(G|θE, θR) among {0.1, 0.01, 0.001}, and the learning rate αt for P(D|θV) among {0.1, 0.01, 0.001}. The sliding window size m is among {3, 5, 7}. To compare with previous works, the dimension kw is 50 for RE and 100 for KGC. Table 2 shows all parameters used in our experiments: Harmonic Factor λ = 0.0001, Knowledge Learning Rate αk = 0.001, Text Learning Rate αt = 0.01, Hidden Layer Dimension kc = 230, Word/Entity/Relation Dimension kw = 50, Position Dimension kp = 5, Window Size m = 3, Dropout Probability p = 0.5.
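
For readers who want to reproduce the reported setup, the Table 2 values quoted above can be gathered into a single configuration object. The sketch below is only illustrative: the dictionary keys and the variable name JOINT_NRE_CONFIG are ours and do not necessarily match the names used in the authors' thunlp/JointNRE code.

# Hyperparameters reported in Table 2 of the paper (values as quoted above).
# Key names are illustrative, not taken from the authors' implementation.
JOINT_NRE_CONFIG = {
    "harmonic_factor_lambda": 1e-4,    # λ, balances the KG and text objectives
    "knowledge_learning_rate": 1e-3,   # αk for P(G|θE, θR), chosen from {0.1, 0.01, 0.001}
    "text_learning_rate": 1e-2,        # αt for P(D|θV), chosen from {0.1, 0.01, 0.001}
    "hidden_dim": 230,                 # kc, hidden layer dimension
    "embedding_dim": 50,               # kw, word/entity/relation dimension (100 for the KGC comparison)
    "position_dim": 5,                 # kp, position embedding dimension
    "window_size": 3,                  # m, sliding window size, chosen from {3, 5, 7}
    "dropout_prob": 0.5,               # p, dropout probability
}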