Towards Foundation Models for Knowledge Graph Reasoning

Authors: Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, Zhaocheng Zhu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the qualities of ULTRA as a foundation model for KG reasoning, we explore the following questions: (1) Is pre-trained ULTRA able to inductively generalize to unseen KGs in the zero-shot manner? (2) Are there any benefits from fine-tuning ULTRA on a specific dataset? (3) How does a single pre-trained ULTRA model compare to models trained from scratch on each target dataset? (4) Do more graphs in the pre-training mix correspond to better performance?
Researcher Affiliation | Collaboration | Mikhail Galkin¹, Xinyu Yuan²,³, Hesham Mostafa¹, Jian Tang²,⁴, Zhaocheng Zhu²,³; ¹Intel AI Lab, ²Mila, ³University of Montréal, ⁴HEC Montréal & CIFAR AI Chair
Pseudocode | No | The paper describes its algorithm in prose and provides a diagram (Figure 3) but does not include a formal pseudocode block or algorithm listing. (A hedged sketch of the relation-graph construction behind GNNr is given after this table.)
Open Source Code | Yes | The code is available: https://github.com/DeepGraphLearning/ULTRA.
Open Datasets | Yes | We conduct a broad evaluation on 57 different KGs with reported, non-saturated results on the KG completion task. The datasets can be categorized into three groups: Transductive datasets (16 graphs) with the fixed set of entities and relations at training and inference time (Gtrain = Ginf): FB15k-237 (Toutanova & Chen, 2015), WN18RR (Dettmers et al., 2018), YAGO3-10 (Mahdisoltani et al., 2014), NELL-995 (Xiong et al., 2017), CoDEx (Small, Medium, and Large) (Safavi & Koutra, 2020), WDsinger, NELL23k, FB15k237(10), FB15k237(20), FB15k237(50) (Lv et al., 2020), AristoV4 (Chen et al., 2021), DBpedia100k (Ding et al., 2018), ConceptNet100k (Malaviya et al., 2020), Hetionet (Himmelstein et al., 2017). Inductive entity (e) datasets (18 graphs)... Inductive entity and relation (e, r) datasets (23 graphs)...
Dataset Splits | Yes | In the fine-tuning case, we further train the model on the training split of each dataset retaining the checkpoint of the best validation set MRR. We run zero-shot inference experiments once as the results are deterministic, and report an average of 5 runs for each fine-tuning run on each dataset. (A sketch of this evaluation protocol appears after this table.)
Hardware Specification | Yes | ULTRA is relatively small (177k parameters in total, with 60k parameters in GNNr and 117k parameters in GNNe) and is trained for 200,000 steps with batch size of 64 with AdamW optimizer on 2 A100 (40 GB) GPUs. All fine-tuning experiments were done on a single RTX 3090 GPU. (A parameter-count check appears after this table.)
Software Dependencies | No | The paper mentions the 'AdamW optimizer' and states that it uses the 'NBFNet' architecture, but it does not specify version numbers for any software libraries, programming languages, or other dependencies.
Experiment Setup | Yes | ULTRA is trained for 200,000 steps with batch size of 64 with AdamW optimizer and learning rate of 0.0005. Each batch contains only one graph and training samples from this graph. The sampling probability of the graph in the mixture is proportional to the number of edges in this training graph. (A sketch of this pre-training loop appears after this table.)
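
Since the paper ships no pseudocode, the block below is a minimal sketch, assuming the relation-graph construction the paper describes: relations become nodes and are connected by four interaction types (head-to-head, head-to-tail, tail-to-head, tail-to-tail) whenever they share head or tail entities in the original KG. The function name `build_relation_graph` and the toy triples are illustrative, not identifiers from the authors' repository.

```python
# Minimal sketch, not the authors' implementation: building the graph over relation
# types that a relation-level GNN (GNNr) could operate on. Self-loop interaction
# edges are kept for simplicity.
from collections import defaultdict
from itertools import product


def build_relation_graph(triples):
    """triples: iterable of (head, relation, tail) tuples from a single KG."""
    heads, tails = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        heads[r].add(h)
        tails[r].add(t)

    relations = sorted(set(heads) | set(tails))
    edges = []
    for r1, r2 in product(relations, repeat=2):
        if heads[r1] & heads[r2]:
            edges.append((r1, "head-to-head", r2))
        if heads[r1] & tails[r2]:
            edges.append((r1, "head-to-tail", r2))
        if tails[r1] & heads[r2]:
            edges.append((r1, "tail-to-head", r2))
        if tails[r1] & tails[r2]:
            edges.append((r1, "tail-to-tail", r2))
    return edges


# Toy usage: two relations that share the entity "canada" get interaction edges.
kg = [("montreal", "located_in", "canada"), ("canada", "has_capital", "ottawa")]
print(build_relation_graph(kg))
```

In the paper's design, GNNr would run over these interaction edges conditioned on the query relation, and its output would parameterize the NBFNet-style entity-level GNNe; that second stage is omitted from this sketch.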
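
The Dataset Splits row fully describes the evaluation protocol: zero-shot inference is run once (it is deterministic for a frozen model), while fine-tuning is repeated over 5 runs, keeping the best-validation-MRR checkpoint each time and averaging the test MRRs. Below is a hedged outline of that protocol; `evaluate` and `finetune` are hypothetical callables standing in for the repository's actual entry points.

```python
# Hedged outline of the quoted evaluation protocol; helper callables are assumed.
import statistics


def zero_shot_mrr(pretrained_model, dataset, evaluate):
    # Single run on the frozen pre-trained model: inference is deterministic.
    return evaluate(pretrained_model, dataset, split="test")


def finetuned_mrr(pretrained_model, dataset, finetune, evaluate, seeds=(0, 1, 2, 3, 4)):
    test_mrrs = []
    for seed in seeds:
        # Further train on the dataset's training split; model selection by
        # validation MRR, as stated in the Dataset Splits row.
        best_ckpt = finetune(pretrained_model, dataset, seed=seed,
                             select_by="valid_mrr")
        test_mrrs.append(evaluate(best_ckpt, dataset, split="test"))
    return statistics.mean(test_mrrs)  # average of 5 fine-tuning runs
```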
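
For the parameter budget quoted in the Hardware Specification row (60k in GNNr plus 117k in GNNe, 177k in total), the standard PyTorch idiom below would verify the counts on an instantiated model; the attribute names `relation_gnn` and `entity_gnn` are assumptions, not the authors' identifiers.

```python
# Standard PyTorch idiom for checking the reported budget (60k + 117k = 177k).
import torch.nn as nn


def count_parameters(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

# Hypothetical usage on an instantiated ULTRA model:
# count_parameters(model.relation_gnn)  # expected around 60k  (GNNr)
# count_parameters(model.entity_gnn)    # expected around 117k (GNNe)
# count_parameters(model)               # expected around 177k in total
```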
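
The Experiment Setup row determines the pre-training loop: 200,000 steps, batch size 64, AdamW with learning rate 0.0005, and each batch drawn from a single graph sampled with probability proportional to its number of edges. Below is a minimal sketch of that loop under the assumption of a model object exposing a `loss(graph_name, batch)` method, which is an assumed interface rather than the authors' API.

```python
# Sketch of the edge-proportional pre-training mixture; model API is assumed.
import random

import torch


def sample_graph(graphs):
    """graphs: dict mapping graph name -> list of (h, r, t) training triples."""
    names = list(graphs)
    weights = [len(graphs[name]) for name in names]  # proportional to edge count
    return random.choices(names, weights=weights, k=1)[0]


def pretrain(model, graphs, steps=200_000, batch_size=64, lr=5e-4):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        name = sample_graph(graphs)  # one graph per batch
        batch = random.sample(graphs[name], k=min(batch_size, len(graphs[name])))
        loss = model.loss(name, batch)  # assumed interface
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```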