Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Prediction

Authors: Jinheon Baek, Dong Bok Lee, Sung Ju Hwang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our model on multiple benchmark datasets for knowledge graph completion and drug-drug interaction prediction. The results show that our model significantly outperforms relevant baselines for out-of-graph link prediction tasks.
Researcher Affiliation | Collaboration | Jinheon Baek (KAIST), Dong Bok Lee (KAIST), Sung Ju Hwang (KAIST, AITRICS), South Korea; {jinheon.baek, markhi, sjhwang82}@kaist.ac.kr
Pseudocode | Yes | Algorithm 1: Meta-Learning of GEN
Open Source Code | Yes | Code is available at https://github.com/JinheonBaek/GEN
Open Datasets | Yes | We validate GENs for their OOG link prediction performance on three knowledge graph completion datasets, namely FB15K-237 [42], NELL-995 [55], and WN18RR [9]. We also validate GENs on the OOG drug-drug interaction prediction task on the DeepDDI [31] and BIOSNAP-sub [63] datasets.
Dataset Splits | Yes | Then, we generate a task by sampling the set of simulated unseen entities during meta-training, for the learned model to generalize over actual unseen entities (see Figure 1, center). Formally, each task $\mathcal{T}$ over a distribution $p(\mathcal{T})$ corresponds to a set of unseen entities $\mathcal{E}'_{\mathcal{T}} \subset \mathcal{E}'$, with a predefined number of instances $|\mathcal{E}'_{\mathcal{T}}| = N$. Then we divide the triplets associated with each entity $e'_i \in \mathcal{E}'_{\mathcal{T}}$ into the support set $S_i$ and the query set $Q_i$: $\mathcal{T} = \bigcup_{i=1}^{N} S_i \cup Q_i$, where $S_i = \{(e'_i, r_j, e_j) \text{ or } (e_j, r_j, e'_i)\}_{j=1}^{K}$ and $Q_i = \{(e'_i, r_j, e_j) \text{ or } (e_j, r_j, e'_i)\}_{j=K+1}^{M_i}$, with $e_j \in (\mathcal{E} \cup \mathcal{E}'_{\mathcal{T}})$. $K$ is the few-shot size, and $M_i$ is the number of triplets associated with each unseen entity $e'_i$. Our meta-objective is then learning to represent the unseen entities as $\phi'$ using a support set $S$ with a meta-function $f$, to maximize the triplet score on a query set $Q$ with a score function $s$ as follows: ... Once the model is trained with the meta-training tasks $\mathcal{T}_{\text{train}}$, we can apply it to unseen meta-test tasks $\mathcal{T}_{\text{test}}$, whose set of entities is disjoint from $\mathcal{T}_{\text{train}}$, as shown in the center of Figure 1. (A sketch of this episodic sampling appears after the table.)
Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., CPU, GPU models, memory, etc.).
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers.
Experiment Setup | Yes | To model such uncertainties, we stochastically embed the unseen entities by learning the distribution over an unseen entity embedding $\phi'_i$. To this end, we first assume that the true posterior distribution has the following form: $p(\phi'_i \mid S_i, \phi)$. Since computation of the true posterior distribution is intractable, we approximate the posterior using $q(\phi'_i \mid S_i, \phi) = \mathcal{N}(\phi'_i \mid \mu_i, \mathrm{diag}(\sigma_i^2))$, and then compute the mean and variance via two individual transductive GEN layers: $\mu_i = g_{\theta_\mu}(S_i, \phi)$ and $\sigma_i = g_{\theta_\sigma}(S_i, \phi)$, which modifies the Graph VAE [18] to our setting. The form to maximize the score function $s$ is then defined as follows: $\tilde{s}(e_h, r, e_t) = \frac{1}{L} \sum_{l=1}^{L} s(e_h, r, e_t; \phi'^{(l)}, \theta)$, $\phi'^{(l)} \sim q(\phi' \mid S, \phi)$ (5), where we set the MC sample size to $L = 1$ during meta-training for computational efficiency. Also, we perform MC approximation with a sufficiently large sample size (e.g., $L = 10$) at meta-test. ... We then use a hinge loss to optimize our model as follows: $\sum_{(e_h, r, e_t) \in Q_i} \sum_{(e_h, r, e_t)^- \in Q_i^-} \max\{\gamma - s^+(e_h, r, e_t) + s^-(e_h, r, e_t),\, 0\}$ (6), where $\gamma > 0$ is a margin hyper-parameter... For both I-GEN and T-GEN, we use DistMult for the initial embeddings of entities and relations, and the score function. ... To train baselines, we use the Seen-to-Seen (with Support Set) scheme as in the KG completion task, where support triplets of the meta-validation and meta-test sets are included during training. We report detailed experimental setups in the supplementary file. (Sketches of Eq. (5) and Eq. (6) appear after the table.)
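The episodic split quoted in the Dataset Splits row can be made concrete with a short Python sketch: sample $N$ simulated unseen entities, then split each entity's triplets into a $K$-shot support set $S_i$ and a query set $Q_i$ holding the remaining $M_i - K$ triplets. This is a minimal sketch under assumed data structures; `sample_task` and `triples_by_entity` are illustrative names, not the authors' code.

```python
# Minimal sketch of episodic task sampling for meta-training GEN.
# Assumes `triples_by_entity` maps each simulated-unseen entity to the
# list of (head, relation, tail) triples it participates in.
import random

def sample_task(unseen_entities, triples_by_entity, N=5, K=3):
    """Sample one task T: N unseen entities, each split into a K-shot
    support set S_i and a query set Q_i with the remaining M_i - K triples."""
    task_entities = random.sample(unseen_entities, N)
    support, query = {}, {}
    for e in task_entities:
        triples = list(triples_by_entity[e])
        random.shuffle(triples)
        support[e] = triples[:K]   # S_i: used to infer the embedding phi'_i
        query[e] = triples[K:]     # Q_i: used to evaluate the meta-objective
    return support, query
```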
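Equation (5) combines the two variance-network heads with reparameterized sampling and a DistMult score, averaged over $L$ Monte Carlo samples. The PyTorch sketch below illustrates that step; `g_mu` and `g_log_sigma` stand in for the paper's transductive GEN layers $g_{\theta_\mu}$ and $g_{\theta_\sigma}$ and are assumptions for illustration, not the released implementation.

```python
# Hedged PyTorch sketch of the MC-averaged stochastic scoring in Eq. (5),
# for a query triple whose head is the unseen entity.
import torch

def distmult(h, r, t):
    # DistMult score: sum over the elementwise product of head, relation, tail
    return (h * r * t).sum(dim=-1)

def mc_score(g_mu, g_log_sigma, support, phi, r_emb, t_emb, L=10):
    """Average DistMult over L samples phi'^(l) ~ q(phi' | S, phi).
    The paper uses L = 1 at meta-training and, e.g., L = 10 at meta-test."""
    mu = g_mu(support, phi)                  # mu_i = g_{theta_mu}(S_i, phi)
    log_sigma = g_log_sigma(support, phi)    # parameterizes sigma_i
    scores = []
    for _ in range(L):
        eps = torch.randn_like(mu)           # reparameterization trick
        phi_l = mu + log_sigma.exp() * eps   # one sample phi'^(l)
        scores.append(distmult(phi_l, r_emb, t_emb))
    return torch.stack(scores).mean(dim=0)   # (1/L) * sum_l s(.; phi'^(l))
```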
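The hinge loss in Eq. (6) then contrasts scores of true query triplets against corrupted negatives with margin $\gamma$. A minimal sketch, assuming `pos_scores` and `neg_scores` are aligned tensors of $s^+$ and $s^-$ values:

```python
# Minimal sketch of the margin-based hinge loss in Eq. (6).
import torch

def hinge_loss(pos_scores, neg_scores, gamma=1.0):
    # max{gamma - s+(e_h, r, e_t) + s-(e_h, r, e_t), 0}, summed over pairs
    return torch.clamp(gamma - pos_scores + neg_scores, min=0.0).sum()
```

Negative triplets in $Q_i^-$ are typically obtained by corrupting the head or tail entity of each query triplet.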