Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Open-World Knowledge Graph Completion
Authors: Baoxu Shi, Tim Weninger
AAAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on large data sets, both old and new, show that ConMask performs well in the open-world KGC task and even outperforms existing KGC models on the standard closed-world KGC task. |
| Researcher Affiliation | Academia | Baoxu Shi, Tim Weninger University of Notre Dame EMAIL |
| Pseudocode | No | The paper describes the model components and their interactions but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | ConMask is implemented in TensorFlow. The source code is available at https://github.com/bxshi/ConMask. |
| Open Datasets | Yes | The Freebase 15K (FB15k) data set is widely used in KGC... we introduce two new data sets DBPedia50k and DBPedia500k for both open-world and closed-world KGC tasks. Statistics of all data sets are shown in Tab. 2. Also, 'we also released two new DBPedia data sets for KGC research and development.' |
| Dataset Splits | Yes | Statistics of all data sets are shown in Tab. 2. The methodology used to evaluate the open-world and closed-world KGC tasks is similar to the related work. Specifically, we randomly selected 90% of the entities in the KG and induced a KG subgraph using the selected entities, and from this reduced KG, we further removed 10% of the relationships, i.e., graph-edges, to create KG_train. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running its experiments. |
| Software Dependencies | No | The paper states 'ConMask is implemented in TensorFlow' but does not provide specific version numbers for TensorFlow or other software dependencies. |
| Experiment Setup | Yes | Training parameters were set empirically but without finetuning. We set the word embedding size k = 200, maximum entity content and name length k_c = k_n = 512. The word embeddings are from the publicly available pre-trained 200-dimensional GloVe embeddings (Pennington, Socher, and Manning 2014). The content masking window size k_m = 6, number of FCN layers k_fcn = 3 where each layer has 2 convolutional layers and a BN layer with a moving average decay of 0.9 followed by a dropout with a keep probability p = 0.5. Max-pooling in each FCN layer has a pool size and stride size of 2. The mini-batch size used by ConMask is k_b = 200. We use Adam as the optimizer with a learning rate of 10^-2. The target sampling set sizes for \|E+\| and \|E−\| are 1 and 4 respectively. All open-world KGC models were run for at most 200 epochs. |
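
The split procedure quoted in the Dataset Splits row can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function name `split_open_world`, the `(head, relation, tail)` triple representation, the seed handling, and the rounding are all assumptions. The quoted text does not say how the removed edges or the held-out entities are used afterwards, so the sketch only returns them.

```python
import random


def split_open_world(triples, entity_frac=0.9, edge_removal_frac=0.1, seed=0):
    """Hypothetical sketch of the quoted split: keep a random 90% of entities,
    induce the subgraph on them, then drop 10% of its edges to form the
    training graph. `triples` is a list of (head, relation, tail) tuples."""
    rng = random.Random(seed)

    # Collect every entity that appears as a head or a tail.
    entities = sorted({e for h, _, t in triples for e in (h, t)})

    # Randomly select the fraction of entities to keep (rounding is an assumption).
    n_keep = int(round(entity_frac * len(entities)))
    kept = set(rng.sample(entities, n_keep))

    # Induced subgraph: only edges whose endpoints are both kept.
    induced = [tr for tr in triples if tr[0] in kept and tr[2] in kept]

    # Remove a fraction of the induced edges; the remainder is the training graph.
    rng.shuffle(induced)
    n_remove = int(round(edge_removal_frac * len(induced)))
    removed, train = induced[:n_remove], induced[n_remove:]
    return kept, train, removed


if __name__ == "__main__":
    # Toy graph with 20 entities and 40 edges, for illustration only.
    toy = [(f"e{i}", "r", f"e{(i + 1) % 20}") for i in range(20)]
    toy += [(f"e{i}", "s", f"e{(i + 5) % 20}") for i in range(20)]
    kept, train, removed = split_open_world(toy, seed=1)
    print(len(kept), len(train), len(removed))
```

Fixing the seed makes the split repeatable, which matters when comparing models on the same reduced graph.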