reproducibilityindex.ai

KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge

Authors: Pengcheng Jiang, Lang Cao, Cao (Danica) Xiao, Parminder Bhatia, Jimeng Sun, Jiawei Han

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on the benchmark datasets FB15K-237, YAGO3-10, and PrimeKG demonstrate the superiority of KG-FIT over state-of-the-art pre-trained language model-based methods, achieving improvements of 14.4%, 13.5%, and 11.9% in the Hits@10 metric for the link prediction task, respectively. Furthermore, KG-FIT yields substantial performance gains of 12.6%, 6.7%, and 17.7% compared to the structure-based base models upon which it is built.
Researcher Affiliation	Collaboration	University of Illinois at Urbana-Champaign; GE HealthCare
Pseudocode	Yes	Algorithm 1 Seed Hierarchy Construction; Algorithm 2 LLM-Guided Cluster Splitting; Algorithm 3 LLM-Guided Bottom-Up Hierarchy Refinement
Open Source Code	Yes	Our code and data are available at https://github.com/pat-jj/KG-FIT.
Open Datasets	Yes	(1) FB15k-237 [40] (CC BY 4.0) is a subset of Freebase [41], a large collaborative knowledge base, focusing on common knowledge; (2) YAGO3-10 [42] is a subset of YAGO [43] (CC BY 4.0), which is a large knowledge base derived from multiple sources including Wikipedia, WordNet, and GeoNames; (3) PrimeKG [44] (CC0 1.0) is a biomedical KG that integrates 20 biomedical resources.
Dataset Splits	Yes	Table 2: Datasets statistics. #Ent./#Rel: number of entities/relations. #Train/#Valid/#Test: number of triples contained in the training/validation/testing set.
Hardware Specification	Yes	For FB15K-237, PrimeKG, and WN18RR, experiments are conducted on a machine equipped with two AMD EPYC 7513 32-Core Processors, 528GB RAM, eight NVIDIA RTX A6000 GPUs, and CUDA 12.4 and the NVIDIA driver version 550.76. For YAGO3-10, due to its large size, experiments are conducted on a machine equipped with two AMD EPYC 7513 32-Core Processors, 528GB RAM, and eight NVIDIA A100 80GB PCIe GPUs.
Software Dependencies	Yes	For FB15K-237, PrimeKG, and WN18RR, experiments are conducted on a machine equipped with two AMD EPYC 7513 32-Core Processors, 528GB RAM, eight NVIDIA RTX A6000 GPUs, and CUDA 12.4 and the NVIDIA driver version 550.76. For YAGO3-10... The system uses CUDA 12.2 and the NVIDIA driver version 535.129.03.
Experiment Setup	Yes	Table 11: Summary of hyperparameters we explored for both base models and KG-FIT. Table 12: Best hyperparameters grid-searched for base models on different datasets. Table 13: Hyperparameters we used for KG-FIT with different base models on different datasets.
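
The Research Type row reports results in the Hits@10 metric for link prediction. For readers unfamiliar with it, the following is a minimal sketch of how Hits@K is commonly computed from the rank assigned to each true entity. The function name `hits_at_k` and the example ranks are illustrative assumptions; they are not taken from the KG-FIT code or results.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int = 10) -> float:
    """Return the fraction of test triples whose true entity is ranked in the top k.

    `ranks` holds the 1-indexed rank of the correct entity for each test triple.
    This is a generic definition, assumed here for illustration only.
    """
    if not ranks:
        raise ValueError("ranks must be non-empty")
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Hypothetical ranks for eight test triples (not data from the paper).
example_ranks = [1, 3, 12, 7, 250, 10, 48, 2]
score = hits_at_k(example_ranks, k=10)
print(f"Hits@10 = {score:.3f}")
```

In this made-up example, five of the eight ranks are at most 10, so the score is 5/8. Whether the paper's evaluation uses filtered or raw ranks is not stated in this summary.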