Neural Knowledge Acquisition via Mutual Attention Between Knowledge Graph and Text
Authors: Xu Han, Zhiyuan Liu, Maosong Sun
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on relation extraction and entity link prediction show that models trained under our joint framework are significantly improved in comparison with other baselines. We conduct experiments on real-world datasets. |
| Researcher Affiliation | Academia | 1Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, National Lab for Information Science and Technology, Tsinghua University, Beijing, China 2Beijing Advanced Innovation Center for Imaging Technology, Capital Normal University, Beijing, China Corresponding author: Zhiyuan Liu (liuzy@tsinghua.edu.cn) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. Methods are described textually and mathematically. |
| Open Source Code | Yes | The source code of this paper can be obtained from https://github.com/thunlp/JointNRE. |
| Open Datasets | Yes | We select Freebase (Bollacker et al. 2008) as the KG for joint learning. In this paper, we adopt datasets extracted from Freebase, FB15K and FB60K, in our experiments. We select sentences from the articles of New York Times. We extract 194,385 sentences containing both head and tail entities in FB15K and annotate with the corresponding relations in triples. We name the corpus NYT-FB15K. The sentences for FB60K come from the dataset used in (Riedel, Yao, and McCallum 2010), containing 570,088 sentences, 63,696 entities, 56 relations and 293,175 facts. We name the corpus NYT-FB60K. |
| Dataset Splits | No | The paper mentions 'test set' and refers to using benchmarks with their 'previous usage' but does not explicitly define the train/validation/test splits (e.g., percentages or counts) or explicitly mention a 'validation set'. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like 'CNN' and 'Skip-Gram' but does not specify their version numbers or list other key libraries with versions, which is necessary for reproducibility. |
| Experiment Setup | Yes | In our joint models, we select the learning rate αk for P(G|θE, θR) among {0.1, 0.01, 0.001}, and the learning rate αt for P(D|θV) among {0.1, 0.01, 0.001}. The sliding window size m is among {3, 5, 7}. To compare with previous works, the dimension kw is 50 for RE and 100 for KGC. Table 2 shows all parameters used in our experiments. (Table 2 lists Harmonic Factor λ 0.0001, Knowledge Learning Rate αk 0.001, Text Learning Rate αt 0.01, Hidden Layer Dimension kc 230, Word/Entity/Relation Dimension kw 50, Position Dimension kp 5, Window Size m 3, Dropout Probability p 0.5) |
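The Table 2 values quoted above can be collected into a single configuration for anyone attempting a re-run; a minimal sketch, assuming one would reimplement the joint model in Python (the dict keys are illustrative names chosen here, not identifiers from the authors' released code):

```python
# Hyperparameters as reported in Table 2 of the paper.
# Key names are hypothetical; only the values come from the paper.
JOINT_NRE_CONFIG = {
    "harmonic_factor_lambda": 1e-4,   # λ, balances KG and text objectives
    "knowledge_lr_alpha_k": 1e-3,     # αk, learning rate for P(G | θE, θR)
    "text_lr_alpha_t": 1e-2,          # αt, learning rate for P(D | θV)
    "hidden_dim_kc": 230,             # kc, CNN hidden-layer dimension
    "embed_dim_kw": 50,               # kw, word/entity/relation dim (100 for KGC)
    "position_dim_kp": 5,             # kp, position-embedding dimension
    "window_size_m": 3,               # m, CNN sliding-window size
    "dropout_p": 0.5,                 # p, dropout probability
}

print(JOINT_NRE_CONFIG["hidden_dim_kc"])  # 230
```

Note that the paper's search spaces (αk, αt ∈ {0.1, 0.01, 0.001}; m ∈ {3, 5, 7}) are wider than these final values, so a faithful reproduction would sweep those ranges rather than fix them upfront.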