Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Prediction
Authors: Jinheon Baek, Dong Bok Lee, Sung Ju Hwang
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate our model on multiple benchmark datasets for knowledge graph completion and drug-drug interaction prediction. The results show that our model significantly outperforms relevant baselines for out-of-graph link prediction tasks. |
| Researcher Affiliation | Collaboration | Jinheon Baek (KAIST), Dong Bok Lee (KAIST), Sung Ju Hwang (KAIST, AITRICS), South Korea; {jinheon.baek, markhi, sjhwang82}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1 Meta-Learning of GEN |
| Open Source Code | Yes | Code is available at https://github.com/JinheonBaek/GEN |
| Open Datasets | Yes | We validate GENs for their OOG link prediction performance on three knowledge graph completion datasets, namely FB15K-237 [42], NELL-995 [55], and WN18RR [9]. We also validate GENs for the OOG drug-drug interaction prediction task on the DeepDDI [31] and BIOSNAP-sub [63] datasets. |
| Dataset Splits | Yes | Then, we generate a task by sampling the set of simulated unseen entities during meta-training, for the learned model to generalize over actual unseen entities (see Figure 1, center). Formally, each task $\mathcal{T}$ over a distribution $p(\mathcal{T})$ corresponds to a set of unseen entities $\mathcal{E}_{\mathcal{T}} \subset \mathcal{E}'$, with a predefined number of instances $\lvert \mathcal{E}_{\mathcal{T}} \rvert = N$. Then we divide the triplets associated with each entity $e'_i \in \mathcal{E}_{\mathcal{T}}$ into the support set $\mathcal{S}_i$ and the query set $\mathcal{Q}_i$: $\mathcal{T} = \bigcup_{i=1}^{N} \mathcal{S}_i \cup \mathcal{Q}_i$, where $\mathcal{S}_i = \{(e'_i, r_j, e_j) \text{ or } (e_j, r_j, e'_i)\}_{j=1}^{K}$ and $\mathcal{Q}_i = \{(e'_i, r_j, e_j) \text{ or } (e_j, r_j, e'_i)\}_{j=K+1}^{M_i}$, with $e_j \in (\mathcal{E} \cup \mathcal{E}')$. $K$ is the few-shot size, and $M_i$ is the number of triplets associated with each unseen entity $e'_i$. Our meta-objective is then learning to represent the unseen entities as $\phi'$ using a support set $\mathcal{S}$ with a meta-function $f$, to maximize the triplet score on a query set $\mathcal{Q}$ with a score function $s$ as follows: ... Once the model is trained with the meta-training tasks $\mathcal{T}_{train}$, we can apply it to unseen meta-test tasks $\mathcal{T}_{test}$, whose set of entities is disjoint from $\mathcal{T}_{train}$, as shown in the center of Figure 1. (A task-sampling sketch is given below the table.) |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., CPU, GPU models, memory, etc.). |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | Yes | To model such uncertainties, we stochastically embed the unseen entities by learning the distribution over an unseen entity embedding $\phi'_i$. To this end, we first assume that the true posterior distribution has the following form: $p(\phi'_i \mid \mathcal{S}_i, \phi)$. Since computation of the true posterior distribution is intractable, we approximate the posterior using $q(\phi'_i \mid \mathcal{S}_i, \phi) = \mathcal{N}(\phi'_i \mid \mu_i, \mathrm{diag}(\sigma_i^2))$, and then compute the mean and variance via two individual transductive GEN layers: $\mu_i = g_{\theta_\mu}(\mathcal{S}_i, \phi)$ and $\sigma_i = g_{\theta_\sigma}(\mathcal{S}_i, \phi)$, which modifies the Graph VAE [18] to our setting. The approximated score function $\bar{s}$ is then defined as follows: $\bar{s}(e_h, r, e_t) = \frac{1}{L} \sum_{l=1}^{L} s(e_h, r, e_t; \phi'^{(l)}, \theta)$, $\phi'^{(l)} \sim q(\phi' \mid \mathcal{S}, \phi)$ (Eq. 5), where we set the MC sample size to $L = 1$ during meta-training for computational efficiency. Also, we perform MC approximation with a sufficiently large sample size (e.g., $L = 10$) at meta-test. ... We then use the hinge loss to optimize our model as follows: $\sum_{(e_h, r, e_t) \in \mathcal{Q}_i} \sum_{(e_h, r, e_t)^- \in \mathcal{Q}_i^-} \max\{\gamma - s^+(e_h, r, e_t) + s^-(e_h, r, e_t),\, 0\}$ (Eq. 6), where $\gamma > 0$ is a margin hyper-parameter... For both I-GEN and T-GEN, we use DistMult for the initial embeddings of entities and relations, and the score function. ... To train baselines, we use the Seen to Seen (with Support Set) scheme as in the KG completion task, where support triplets of meta-validation and meta-test sets are included during training. We report detailed experimental setups in the supplementary file. (A scoring-and-loss sketch is given below the table.) |
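
As a reading aid for the task construction quoted in the Dataset Splits row, here is a minimal sketch of how one meta-training episode could be sampled, assuming triplets are plain (head, relation, tail) tuples. All names (`sample_episode`, `by_entity`) are hypothetical illustrations, not taken from the authors' released code.

```python
import random
from collections import defaultdict

def sample_episode(triples, num_unseen, few_shot_k):
    """Sample one meta-training task: pick N simulated unseen entities,
    then split each entity's triples into a K-shot support set S_i and
    a query set Q_i holding the remaining M_i - K triples."""
    # Index every triple by the entities it touches.
    by_entity = defaultdict(list)
    for h, r, t in triples:
        by_entity[h].append((h, r, t))
        by_entity[t].append((h, r, t))

    # Only entities with more than K triples yield a non-empty query set.
    candidates = [e for e, ts in by_entity.items() if len(ts) > few_shot_k]
    unseen = random.sample(candidates, num_unseen)

    support, query = {}, {}
    for e in unseen:
        shuffled = random.sample(by_entity[e], len(by_entity[e]))
        support[e] = shuffled[:few_shot_k]   # S_i: K support triples
        query[e] = shuffled[few_shot_k:]     # Q_i: the rest, scored in the loss
    return support, query
```

At meta-test time, the same support/query split is applied to the actual unseen entities, whose set is disjoint from the meta-training entities.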
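Similarly, for the Experiment Setup row, the sketch below illustrates the reparameterized Monte Carlo scoring of Eq. (5) with a DistMult score and the margin loss of Eq. (6), in PyTorch. Tensor shapes, function names, and the negative-sampling scheme are assumptions for illustration, not the authors' implementation.

```python
import torch

def mc_score(mu, sigma, rel_emb, tail_emb, num_samples=1):
    """Eq. (5): average the DistMult score over L reparameterized samples
    of the unseen-entity embedding phi' ~ N(mu, diag(sigma^2)).
    Here the unseen entity is assumed to be the head; mu and sigma would
    come from the two transductive GEN layers g_theta_mu, g_theta_sigma."""
    scores = []
    for _ in range(num_samples):
        phi = mu + sigma * torch.randn_like(sigma)         # one MC sample of phi'
        scores.append((phi * rel_emb * tail_emb).sum(-1))  # DistMult: <phi, r, t>
    return torch.stack(scores).mean(dim=0)

def hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Eq. (6): margin-based ranking loss between positive query triples
    and their corrupted (negative) counterparts."""
    return torch.clamp(margin - pos_scores + neg_scores, min=0.0).sum()
```

Per the paper, $L = 1$ is used during meta-training for efficiency and $L = 10$ at meta-test; the negatives in $\mathcal{Q}_i^-$ are typically built by corrupting one entity of each positive query triple.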