Towards Foundation Models for Knowledge Graph Reasoning

Authors: Mikhail Galkin, Xinyu Yuan, Hesham Mostafa, Jian Tang, Zhaocheng Zhu

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the qualities of ULTRA as a foundation model for KG reasoning, we explore the following questions: (1) Is pre-trained ULTRA able to inductively generalize to unseen KGs in the zero-shot manner? (2) Are there any benefits from fine-tuning ULTRA on a specific dataset? (3) How does a single pre-trained ULTRA model compare to models trained from scratch on each target dataset? (4) Do more graphs in the pre-training mix correspond to better performance?
Researcher Affiliation | Collaboration | Mikhail Galkin¹, Xinyu Yuan²,³, Hesham Mostafa¹, Jian Tang²,⁴, Zhaocheng Zhu²,³; ¹Intel AI Lab, ²Mila, ³University of Montréal, ⁴HEC Montréal & CIFAR AI Chair
Pseudocode | No | The paper describes its algorithm in prose and provides a diagram (Figure 3) but does not include a formal pseudocode block or algorithm listing. (A hedged sketch of the relation-graph construction behind GNNr is given after this table.)
Open Source Code | Yes | The code is available: https://github.com/DeepGraphLearning/ULTRA.
Open Datasets | Yes | We conduct a broad evaluation on 57 different KGs with reported, non-saturated results on the KG completion task. The datasets can be categorized into three groups: Transductive datasets (16 graphs) with the fixed set of entities and relations at training and inference time (Gtrain = Ginf): FB15k-237 (Toutanova & Chen, 2015), WN18RR (Dettmers et al., 2018), YAGO3-10 (Mahdisoltani et al., 2014), NELL-995 (Xiong et al., 2017), CoDEx (Small, Medium, and Large) (Safavi & Koutra, 2020), WDsinger, NELL23k, FB15k237(10), FB15k237(20), FB15k237(50) (Lv et al., 2020), AristoV4 (Chen et al., 2021), DBpedia100k (Ding et al., 2018), ConceptNet100k (Malaviya et al., 2020), Hetionet (Himmelstein et al., 2017). Inductive entity (e) datasets (18 graphs)... Inductive entity and relation (e, r) datasets (23 graphs)...
Dataset Splits | Yes | In the fine-tuning case, we further train the model on the training split of each dataset retaining the checkpoint of the best validation set MRR. We run zero-shot inference experiments once as the results are deterministic, and report an average of 5 runs for each fine-tuning run on each dataset. (A sketch of this evaluation protocol appears after this table.)
Hardware Specification | Yes | ULTRA is relatively small (177k parameters in total, with 60k parameters in GNNr and 117k parameters in GNNe) and is trained for 200,000 steps with batch size of 64 with AdamW optimizer on 2 A100 (40 GB) GPUs. All fine-tuning experiments were done on a single RTX 3090 GPU. (A parameter-count check appears after this table.)
Software Dependencies | No | The paper mentions the 'AdamW optimizer' and states that it uses the 'NBFNet' architecture, but it does not specify version numbers for any software libraries, programming languages, or other dependencies.
Experiment Setup | Yes | ULTRA is trained for 200,000 steps with batch size of 64 with AdamW optimizer and learning rate of 0.0005. Each batch contains only one graph and training samples from this graph. The sampling probability of the graph in the mixture is proportional to the number of edges in this training graph. (A sketch of this pre-training loop appears after this table.)
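
Since the paper ships no pseudocode, the block below is a minimal sketch, assuming the relation-graph construction the paper describes: relations become nodes and are connected by four interaction types (head-to-head, head-to-tail, tail-to-head, tail-to-tail) whenever they share head or tail entities in the original KG. The function name `build_relation_graph` and the toy triples are illustrative, not identifiers from the authors' repository.

```python
# Minimal sketch, not the authors' implementation: building the graph over relation
# types that a relation-level GNN (GNNr) could operate on. Self-loop interaction
# edges are kept for simplicity.
from collections import defaultdict
from itertools import product


def build_relation_graph(triples):
    """triples: iterable of (head, relation, tail) tuples from a single KG."""
    heads, tails = defaultdict(set), defaultdict(set)
    for h, r, t in triples:
        heads[r].add(h)
        tails[r].add(t)

    relations = sorted(set(heads) | set(tails))
    edges = []
    for r1, r2 in product(relations, repeat=2):
        if heads[r1] & heads[r2]:
            edges.append((r1, "head-to-head", r2))
        if heads[r1] & tails[r2]:
            edges.append((r1, "head-to-tail", r2))
        if tails[r1] & heads[r2]:
            edges.append((r1, "tail-to-head", r2))
        if tails[r1] & tails[r2]:
            edges.append((r1, "tail-to-tail", r2))
    return edges


# Toy usage: two relations that share the entity "canada" get interaction edges.
kg = [("montreal", "located_in", "canada"), ("canada", "has_capital", "ottawa")]
print(build_relation_graph(kg))
```

In the paper's design, GNNr would run over these interaction edges conditioned on the query relation, and its output would parameterize the NBFNet-style entity-level GNNe; that second stage is omitted from this sketch.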
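
The Dataset Splits row fully describes the evaluation protocol: zero-shot inference is run once (it is deterministic for a frozen model), while fine-tuning is repeated over 5 runs, keeping the best-validation-MRR checkpoint each time and averaging the test MRRs. Below is a hedged outline of that protocol; `evaluate` and `finetune` are hypothetical callables standing in for the repository's actual entry points.

```python
# Hedged outline of the quoted evaluation protocol; helper callables are assumed.
import statistics


def zero_shot_mrr(pretrained_model, dataset, evaluate):
    # Single run on the frozen pre-trained model: inference is deterministic.
    return evaluate(pretrained_model, dataset, split="test")


def finetuned_mrr(pretrained_model, dataset, finetune, evaluate, seeds=(0, 1, 2, 3, 4)):
    test_mrrs = []
    for seed in seeds:
        # Further train on the dataset's training split; model selection by
        # validation MRR, as stated in the Dataset Splits row.
        best_ckpt = finetune(pretrained_model, dataset, seed=seed,
                             select_by="valid_mrr")
        test_mrrs.append(evaluate(best_ckpt, dataset, split="test"))
    return statistics.mean(test_mrrs)  # average of 5 fine-tuning runs
```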
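
For the parameter budget quoted in the Hardware Specification row (60k in GNNr plus 117k in GNNe, 177k in total), the standard PyTorch idiom below would verify the counts on an instantiated model; the attribute names `relation_gnn` and `entity_gnn` are assumptions, not the authors' identifiers.

```python
# Standard PyTorch idiom for checking the reported budget (60k + 117k = 177k).
import torch.nn as nn


def count_parameters(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())

# Hypothetical usage on an instantiated ULTRA model:
# count_parameters(model.relation_gnn)  # expected around 60k  (GNNr)
# count_parameters(model.entity_gnn)    # expected around 117k (GNNe)
# count_parameters(model)               # expected around 177k in total
```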
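
The Experiment Setup row determines the pre-training loop: 200,000 steps, batch size 64, AdamW with learning rate 0.0005, and each batch drawn from a single graph sampled with probability proportional to its number of edges. Below is a minimal sketch of that loop under the assumption of a model object exposing a `loss(graph_name, batch)` method, which is an assumed interface rather than the authors' API.

```python
# Sketch of the edge-proportional pre-training mixture; model API is assumed.
import random

import torch


def sample_graph(graphs):
    """graphs: dict mapping graph name -> list of (h, r, t) training triples."""
    names = list(graphs)
    weights = [len(graphs[name]) for name in names]  # proportional to edge count
    return random.choices(names, weights=weights, k=1)[0]


def pretrain(model, graphs, steps=200_000, batch_size=64, lr=5e-4):
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    for _ in range(steps):
        name = sample_graph(graphs)  # one graph per batch
        batch = random.sample(graphs[name], k=min(batch_size, len(graphs[name])))
        loss = model.loss(name, batch)  # assumed interface
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```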