Knowledge Graph Embedding With Iterative Guidance From Soft Rules
Authors: Shu Guo, Quan Wang, Lihong Wang, Bin Wang, Li Guo
AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate RUGE in link prediction on Freebase and YAGO. Experimental results show that: 1) with rule knowledge injected iteratively, RUGE achieves significant and consistent improvements over state-of-the-art baselines; and 2) despite their uncertainties, automatically extracted soft rules are highly beneficial to KG embedding, even those with moderate confidence levels. |
| Researcher Affiliation | Academia | Shu Guo (1,2), Quan Wang (1,2,3), Lihong Wang (4), Bin Wang (1,2), Li Guo (1,2). Affiliations: (1) Institute of Information Engineering, Chinese Academy of Sciences; (2) School of Cyber Security, University of Chinese Academy of Sciences; (3) State Key Laboratory of Information Security, Chinese Academy of Sciences; (4) National Computer Network Emergency Response Technical Team & Coordination Center of China |
| Pseudocode | Yes | Algorithm 1 summarizes the iterative learning procedure of our approach. |
| Open Source Code | Yes | The code and data used for this paper can be obtained from https://github.com/iieir-km/RUGE. |
| Open Datasets | Yes | We use two datasets: FB15K and YAGO37. The former is a subgraph of Freebase containing 1,345 relations and 14,951 entities, released by Bordes et al. (2013). The latter is extracted from the core facts of YAGO3. |
| Dataset Splits | Yes | Triples on both datasets are split into training, validation, and test sets, used for model training, hyperparameter tuning, and evaluation, respectively. We use the original split for FB15K, and draw a split of 989,132/50,000/50,000 triples for YAGO37. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments. |
| Software Dependencies | No | The paper mentions software such as Java, SGD with AdaGrad, and AMIE+, but does not give version numbers for any of these components (e.g., 'implemented in Java' without a version, or 'SGD with AdaGrad' without naming a library or its version). |
| Experiment Setup | Yes | For all the methods, we create 100 mini-batches on each dataset, and tune the embedding dimensionality d in {50, 100, 150, 200}, the number of negatives per positive triple α in {1, 2, 5, 10}, the initial learning rate γ in {0.01, 0.05, 0.1, 0.5, 1.0}, and the L2 regularization coefficient λ in {0.001, 0.003, 0.01, 0.03, 0.1}. For TransE and its extensions which use the pairwise ranking loss, we further tune the margin δ in {0.1, 0.2, 0.5, 1, 2, 5, 10}. The slackness penalty C in RUGE (cf. Eq. (8)) is selected from {0.001, 0.01, 0.1, 1}, and the number of inner iterations (cf. Eq. (10)) is fixed to τ = 1. Best models are selected by early stopping on the validation set (monitoring MRR), with at most 1000 iterations over the training set. The optimal configurations for RUGE are: d = 200, α = 10, γ = 0.5, λ = 0.01, C = 0.01 on FB15K; and d = 150, α = 10, γ = 1.0, λ = 0.003, C = 0.01 on YAGO37. |
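
The Pseudocode row above points to Algorithm 1, the iterative learning procedure that injects soft-rule knowledge into the embeddings. Below is a minimal Python sketch of such a loop, assuming (this is an interpretation, not the authors' code) that each outer iteration alternates a soft-label prediction stage over rule-derived triples, bounded by the slackness penalty C, with an embedding-rectification stage of τ inner updates over labeled plus soft-labeled triples. All helper names (`predict_soft_labels`, `rectify_embeddings`, `evaluate_mrr`) and the patience value are hypothetical; the released code at https://github.com/iieir-km/RUGE is the authoritative reference.

```python
# Hedged skeleton of a RUGE-style iterative training loop (hypothetical helper names).
def train_iteratively(labeled_triples, rule_groundings, validation_triples,
                      model, optimizer, predict_soft_labels, rectify_embeddings,
                      evaluate_mrr, C=0.01, tau=1, max_iters=1000, patience=10):
    best_mrr, stale = float("-inf"), 0
    for _ in range(max_iters):
        # Stage 1 (assumed): score rule-derived triples with the current embeddings
        # and turn them into soft-labeled training examples, constrained by C.
        soft_labeled = predict_soft_labels(model, rule_groundings, slack_penalty=C)

        # Stage 2 (assumed): tau inner SGD/AdaGrad passes over labeled plus
        # soft-labeled triples (the paper fixes tau = 1).
        for _ in range(tau):
            rectify_embeddings(model, labeled_triples, soft_labeled, optimizer)

        # Early stopping on validation MRR, as stated in the Experiment Setup row
        # (patience value here is an assumption; the paper only caps at 1000 iterations).
        mrr = evaluate_mrr(model, validation_triples)
        if mrr > best_mrr:
            best_mrr, stale = mrr, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return model
```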
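The Dataset Splits row reports that FB15K keeps its original split, while YAGO37 was split by the authors into 989,132 / 50,000 / 50,000 training / validation / test triples. A minimal sketch of drawing such a split with a random shuffle follows; the file name, the tab-separated (head, relation, tail) format, and the fixed seed are assumptions, not details from the paper.

```python
import random

# Hypothetical input: YAGO37 triples as tab-separated (head, relation, tail) lines.
with open("yago37_triples.txt") as f:
    triples = [tuple(line.strip().split("\t")) for line in f if line.strip()]

random.seed(0)          # fixed seed for reproducibility (assumption; not stated in the paper)
random.shuffle(triples)

# Split sizes reported in the Dataset Splits row: 989,132 / 50,000 / 50,000.
train = triples[:989_132]
valid = triples[989_132:989_132 + 50_000]
test  = triples[989_132 + 50_000:989_132 + 100_000]
```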
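The Experiment Setup row fully specifies RUGE's search grid and the selected configurations. A minimal, self-contained sketch of that grid search is below; `train_and_eval_mrr` is a hypothetical stand-in for training RUGE (100 mini-batches, SGD with AdaGrad, early stopping on validation MRR, at most 1000 iterations) and returning the validation MRR. The TransE-specific margin δ is omitted since it does not apply to RUGE.

```python
from itertools import product

# Search grid exactly as listed in the Experiment Setup row.
GRID = {
    "d":       [50, 100, 150, 200],              # embedding dimensionality
    "alpha":   [1, 2, 5, 10],                    # negatives per positive triple
    "gamma":   [0.01, 0.05, 0.1, 0.5, 1.0],      # initial learning rate
    "lambda_": [0.001, 0.003, 0.01, 0.03, 0.1],  # L2 regularization coefficient
    "C":       [0.001, 0.01, 0.1, 1],            # slackness penalty (RUGE, Eq. (8))
}

def grid_search(train_and_eval_mrr):
    """train_and_eval_mrr(config) -> validation MRR; hypothetical training callback."""
    best_cfg, best_mrr = None, float("-inf")
    for values in product(*GRID.values()):
        cfg = dict(zip(GRID.keys(), values))
        mrr = train_and_eval_mrr(cfg)
        if mrr > best_mrr:
            best_cfg, best_mrr = cfg, mrr
    return best_cfg, best_mrr

# Configurations the paper reports as optimal for RUGE:
BEST_FB15K  = {"d": 200, "alpha": 10, "gamma": 0.5, "lambda_": 0.01,  "C": 0.01}
BEST_YAGO37 = {"d": 150, "alpha": 10, "gamma": 1.0, "lambda_": 0.003, "C": 0.01}
```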