Knowledge Graph Embedding With Iterative Guidance From Soft Rules

Authors: Shu Guo, Quan Wang, Lihong Wang, Bin Wang, Li Guo

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate RUGE in link prediction on Freebase and YAGO. Experimental results show that: 1) with rule knowledge injected iteratively, RUGE achieves significant and consistent improvements over state-of-the-art baselines; and 2) despite their uncertainties, automatically extracted soft rules are highly beneficial to KG embedding, even those with moderate confidence levels.
Researcher Affiliation | Academia | Shu Guo (1,2), Quan Wang (1,2,3), Lihong Wang (4), Bin Wang (1,2), Li Guo (1,2). Affiliations: 1) Institute of Information Engineering, Chinese Academy of Sciences; 2) School of Cyber Security, University of Chinese Academy of Sciences; 3) State Key Laboratory of Information Security, Chinese Academy of Sciences; 4) National Computer Network Emergency Response Technical Team & Coordination Center of China.
Pseudocode | Yes | Algorithm 1 summarizes the iterative learning procedure of our approach.
Open Source Code | Yes | The code and data used for this paper can be obtained from https://github.com/iieir-km/RUGE.
Open Datasets | Yes | We use two datasets: FB15K and YAGO37. The former is a subgraph of Freebase containing 1,345 relations and 14,951 entities, released by Bordes et al. (2013). The latter is extracted from the core facts of YAGO3.
Dataset Splits | Yes | Triples on both datasets are split into training, validation, and test sets, used for model training, hyperparameter tuning, and evaluation, respectively. We use the original split for FB15K, and draw a split of 989,132/50,000/50,000 triples for YAGO37.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running its experiments.
Software Dependencies | No | The paper mentions software such as Java, SGD with AdaGrad, and AMIE+, but does not specify version numbers for these components (e.g., 'implemented in Java' without a version, or 'SGD with AdaGrad' without naming a library or its version).
Experiment Setup | Yes | For all the methods, we create 100 mini-batches on each dataset, and tune the embedding dimensionality d in {50, 100, 150, 200}, the number of negatives per positive triple α in {1, 2, 5, 10}, the initial learning rate γ in {0.01, 0.05, 0.1, 0.5, 1.0}, and the L2 regularization coefficient λ in {0.001, 0.003, 0.01, 0.03, 0.1}. For TransE and its extensions which use the pairwise ranking loss, we further tune the margin δ in {0.1, 0.2, 0.5, 1, 2, 5, 10}. The slackness penalty C in RUGE (cf. Eq. (8)) is selected from {0.001, 0.01, 0.1, 1}, and the number of inner iterations (cf. Eq. (10)) is fixed to τ = 1. Best models are selected by early stopping on the validation set (monitoring MRR), with at most 1000 iterations over the training set. The optimal configurations for RUGE are: d = 200, α = 10, γ = 0.5, λ = 0.01, C = 0.01 on FB15K; and d = 150, α = 10, γ = 1.0, λ = 0.003, C = 0.01 on YAGO37.
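
The Pseudocode row above refers to Algorithm 1, which iterates between a soft label prediction stage (inferring truth values for rule-implied triples from the current embeddings and the rules' confidences) and an embedding rectification stage (updating embeddings to fit both hard-labeled and soft-labeled triples). Algorithm 1 itself is not reproduced on this page; the toy Python sketch below only illustrates that alternating structure. It uses a plain bilinear score with a sigmoid in place of the paper's ComplEx scoring function, full-batch squared-error updates in place of the paper's mini-batch logistic loss with negative sampling and AdaGrad, and a single hand-written rule, so every quantity in it is an illustrative assumption, not the authors' (Java) implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy KG: 4 entities, 2 relations.  A triple is (head, relation, tail).
n_ent, n_rel, dim = 4, 2, 8
E = rng.normal(scale=0.1, size=(n_ent, dim))   # entity embeddings
R = rng.normal(scale=0.1, size=(n_rel, dim))   # relation embeddings

observed = [(0, 0, 1), (1, 0, 2), (2, 1, 3)]   # hard-labeled positive triples
# One soft rule "r0(x, y) => r1(x, y)" with confidence 0.8 yields an
# unlabeled grounding for every observed r0 triple.
rule_conf = 0.8
unlabeled = [(h, 1, t) for (h, r, t) in observed if r == 0]

def score(h, r, t):
    """Toy bilinear score squashed to (0, 1); stands in for ComplEx + sigmoid."""
    return 1.0 / (1.0 + np.exp(-np.sum(E[h] * R[r] * E[t])))

lr, C, epochs = 0.5, 0.01, 200
for _ in range(epochs):
    # Stage 1: soft label prediction -- nudge the current score of each
    # rule-implied triple toward satisfying the rule, scaled by the slackness
    # penalty C and the rule's confidence, then clip into [0, 1].
    soft_labels = {x: np.clip(score(*x) + C * rule_conf, 0.0, 1.0)
                   for x in unlabeled}

    # Stage 2: embedding rectification -- one gradient step per (triple, label)
    # pair on a squared-error loss between the score and its label.
    targets = [(x, 1.0) for x in observed] + list(soft_labels.items())
    for (h, r, t), y in targets:
        p = score(h, r, t)
        g = 2.0 * (p - y) * p * (1.0 - p)              # d(loss)/d(raw score)
        gh, gr, gt = g * R[r] * E[t], g * E[h] * E[t], g * E[h] * R[r]
        E[h] -= lr * gh; R[r] -= lr * gr; E[t] -= lr * gt

# Scores of rule-implied triples drift upward as training proceeds.
print({x: round(score(*x), 3) for x in unlabeled})
```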
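
The Dataset Splits row states that the authors drew a 989,132/50,000/50,000 split of YAGO37 triples (FB15K keeps its original split), but the quoted text does not say how that split was produced. The snippet below is only a hypothetical illustration of drawing a fixed-size random split of that shape; the function name and seed are assumptions.

```python
import random

def split_triples(triples, n_valid=50_000, n_test=50_000, seed=42):
    """Shuffle (head, relation, tail) facts and carve off validation/test sets.

    With the full YAGO37 fact list as input, the remainder left for training
    would be the 989,132 triples quoted above.
    """
    triples = list(triples)
    random.Random(seed).shuffle(triples)
    valid = triples[:n_valid]
    test = triples[n_valid:n_valid + n_test]
    train = triples[n_valid + n_test:]
    return train, valid, test
```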
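
The Experiment Setup row lists the search grids and the selected RUGE configurations. The sketch below simply restates those numbers as data and outlines a grid search driven by validation MRR with early stopping; the `train_and_eval` callback and the overall structure are assumptions for illustration, not taken from the released code.

```python
from itertools import product

# Search grids quoted above, written out as data.
search_space = {
    "dim": [50, 100, 150, 200],                  # embedding dimensionality d
    "neg": [1, 2, 5, 10],                        # negatives per positive triple
    "lr":  [0.01, 0.05, 0.1, 0.5, 1.0],          # initial learning rate γ
    "l2":  [0.001, 0.003, 0.01, 0.03, 0.1],      # L2 regularization coefficient λ
    "C":   [0.001, 0.01, 0.1, 1],                # slackness penalty (RUGE only)
}
margin_grid = [0.1, 0.2, 0.5, 1, 2, 5, 10]       # only for pairwise-ranking losses

# Best RUGE configurations reported above.
best = {
    "FB15K":  {"dim": 200, "neg": 10, "lr": 0.5, "l2": 0.01,  "C": 0.01},
    "YAGO37": {"dim": 150, "neg": 10, "lr": 1.0, "l2": 0.003, "C": 0.01},
}

def grid_search(train_and_eval):
    """Return the configuration with the best validation MRR.

    `train_and_eval` is a hypothetical callback that trains a model under one
    configuration (with early stopping, at most 1000 passes over the training
    set) and returns its validation MRR.
    """
    configs = [dict(zip(search_space, values))
               for values in product(*search_space.values())]
    return max(configs, key=lambda cfg: train_and_eval(cfg, max_epochs=1000))
```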