Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

LBMKGC: Large Model-Driven Balanced Multimodal Knowledge Graph Completion

Authors: Yuan Guo, Qian Ma, Hui Li, Qiao Ning, Furui Zhan, Yu Gu, Ge Yu, Shikai Guo

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this section, we evaluate our method using a mainstream task in KGC link prediction task. We first introduce the experimental setup and then present the results. We mainly explore the following four research questions (RQs) regarding LBMKGC: RQ1: Can our model LBMKGC surpass the existing baselines and make substantial progress in the MMKGC task in link prediction? RQ2: How much does the design of each module in LBMKGC contribute to performance? RQ3: When the multimodal information is imbalanced, can LBMKGC maintain stable performance in the MMKGC task? RQ4: Can LBMKGC adaptively adjust multimodal weights based on different entities? 4.1 Experimental Settings Datasets. To better explore the MMKGC task in a diverse and complex environment, we conducted comprehensive experiments and further explorations on three public benchmarks. The DB15K [14] dataset was built by crawling search engine images and aligning them with DBpedia [12], while the MKG-W and MKG-Y [32] datasets were created by extracting subsets from Wikidata [26] and YAGO [22]. We used the pre-trained model of CLIP to extract multimodal features. Baselines. To demonstrate the effectiveness of our method, we conducted a comprehensive comparison and analysis with 21 different state-of-the-art KGC and MMKGC models as our baselines. Metrics. To evaluate our method, we conducted link prediction tasks on three datasets. Link prediction is an important task in knowledge graph completion, aiming to predict the missing entity for a given query (h, r, ?) or (?, r, t). 4.2 Main Results (RQ1) We conducted link prediction experiments and presented the experimental results in Table 1. 4.3 Ablation Study (RQ2) To provide a more detailed demonstration of the effectiveness of each module design and to answer RQ3, we conducted ablation studies, as shown in Table 2. 4.4 Modality-missing Results (RQ3) We conducted link prediction experiments with modality dropout on the DB15K dataset. 4.5 Case Study (RQ4) To intuitively demonstrate the effectiveness of CGuAF, we present the histogram of visual modality weight distribution for all entities in the MKG-W dataset in Figure 4 (a).
Researcher Affiliation	Academia	Yuan Guo Dalian Maritime University EMAIL Qian Ma Dalian Maritime University EMAIL Hui Li Dalian Maritime University EMAIL Qiao Ning Jiangnan University EMAIL Furui Zhan Dalian Maritime University EMAIL Yu Gu Northeastern University, China EMAIL Ge Yu Northeastern University, China EMAIL Shikai Guo Dalian Maritime University EMAIL
Pseudocode	No	The paper includes mathematical formulas describing the model components and their operations (e.g., Equation 1-13) but does not present these in a structured pseudocode or algorithm block.
Open Source Code	Yes	Our code and data are publicly available at: https://github.com/guoynow/LBMKGC.
Open Datasets	Yes	Datasets. To better explore the MMKGC task in a diverse and complex environment, we conducted comprehensive experiments and further explorations on three public benchmarks. The DB15K [14] dataset was built by crawling search engine images and aligning them with DBpedia [12], while the MKG-W and MKG-Y [32] datasets were created by extracting subsets from Wikidata [26] and YAGO [22].
Dataset Splits	No	To ensure fair comparisons, the filter setting [1] is applied to all results to remove candidate triples that already exist in the training set. We utilized three open-source benchmark datasets and conducted training/testing splits and other initialization tasks in a manner consistent with other papers (such as MyGO [35], AdaMF-MAT [36], MMRNS [32]) to ensure the validity of our experiments.
Hardware Specification	Yes	We implemented our LBMKGC model based on the famous open-source KGC library OpenKE. We conducted experiments on a Linux server with Ubuntu 24.04.01 operating system and a single NVIDIA GeForce 4090 GPU.
Software Dependencies	No	We implemented our LBMKGC model based on the famous open-source KGC library OpenKE. We conducted experiments on a Linux server with Ubuntu 24.04.01 operating system and a single NVIDIA GeForce 4090 GPU. We reproduced some advanced models, and some baseline results refer to MyGO [35]. In the LBMKGC, we fix the batch size to 512 and set the training epoch from {1000, 1250, 1500}. The embedding dimensions are tuned from {300, 400, 500} and the negative sampling number K is tuned from {32, 64, 128}. The margin γ is tuned from {8, 12, 16, 20, 24} and the temperature β is set to 4. We optimize the model with Adam and the learning rate is tuned from {1e-5, 2e-5, 5e-5}. For baselines, we reproduce the results following the methodology and parameter setting described in the original papers and their open-source official code.
Experiment Setup	Yes	In the LBMKGC, we fix the batch size to 512 and set the training epoch from {1000, 1250, 1500}. The embedding dimensions are tuned from {300, 400, 500} and the negative sampling number K is tuned from {32, 64, 128}. The margin γ is tuned from {8, 12, 16, 20, 24} and the temperature β is set to 4. We optimize the model with Adam and the learning rate is tuned from {1e-5, 2e-5, 5e-5}.
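
For readers who want to gauge the size of the search space quoted in the Experiment Setup row, the following is a minimal sketch that enumerates the listed values. It assumes an exhaustive grid over all combinations, which the quoted text does not confirm; the dictionary keys and the `enumerate_configs` helper are hypothetical names chosen for illustration and do not come from the authors' code.

```python
from itertools import product

# Values copied from the quoted Experiment Setup text.
# Key names are hypothetical; the paper does not specify config field names.
FIXED = {"batch_size": 512, "temperature_beta": 4, "optimizer": "Adam"}
SEARCH_SPACE = {
    "epochs": [1000, 1250, 1500],
    "embedding_dim": [300, 400, 500],
    "num_negatives_K": [32, 64, 128],
    "margin_gamma": [8, 12, 16, 20, 24],
    "learning_rate": [1e-5, 2e-5, 5e-5],
}


def enumerate_configs(search_space, fixed):
    """Yield one config dict per combination of the tunable values.

    Assumes a full Cartesian grid, which is an assumption of this sketch.
    """
    keys = list(search_space)
    for values in product(*(search_space[k] for k in keys)):
        config = dict(fixed)               # start from the fixed settings
        config.update(zip(keys, values))   # add one choice per tunable key
        yield config


configs = list(enumerate_configs(SEARCH_SPACE, FIXED))
print(len(configs))  # number of candidate configurations under the grid assumption
```

Under this assumption, each dataset would require evaluating every configuration in `configs`; if the authors tuned the parameters one at a time or sampled the space, the actual number of runs would be smaller.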