Knowledge Graphs Can be Learned with Just Intersection Features
Authors: Duy Le, Shaochen Zhong, Zirui Liu, Shuai Xu, Vipin Chaudhary, Kaixiong Zhou, Zhaozhuo Xu
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental results demonstrate that a straightforward fully-connected network leveraging these intersection features can surpass the performance of established KG embedding models and even outperform graph neural network baselines. Additionally, we highlight the substantial training time efficiency gains achieved by our network trained on intersection features. (A sketch of such a network follows the table.) |
| Researcher Affiliation | Academia | ¹Department of Computer and Data Sciences, Case Western Reserve University; ²Department of Computer Science, Rice University; ³Department of Electrical and Computer Engineering, North Carolina State University; ⁴Department of Computer Science, Stevens Institute of Technology. |
| Pseudocode | No | The paper describes algorithms and estimation steps (e.g., using equations and detailed explanations) but does not present them in a formally labeled "Pseudocode" or "Algorithm" block or figure. |
| Open Source Code | Yes | For hyperparameters used in our approach, we refer the readers to our implementation https://github.com/Escanord/Intersection_Features for details. |
| Open Datasets | Yes | We first introduce knowledge graph completion datasets used in the experiments. All datasets are publicly available and widely used. Their statistics are shown in Table 4 in the supplementary materials. NELL-995 (Xiong et al., 2017) contains triples derived from the NELL system to benchmark link prediction for multi-hop entity pairs. WN18RR (Dettmers et al., 2018) is a link prediction dataset derived from WordNet, a large knowledge graph of semantic relations between words. YAGO3-10 (Mahdisoltani et al., 2015) is the largest knowledge graph completion dataset used in our experiments, with more than one million triples taken from Wikipedia. We also include FB15K237 (Toutanova & Chen, 2015) and FB15K (Bordes et al., 2013), which are small subsets of knowledge base relation triples in Freebase, a heterogeneous and well-known knowledge graph. |
| Dataset Splits | Yes | We train the models on the train graph. Next, given the train graph and a corrupted triple (h, r, ?) from the test graph, we would like to predict the missing entity from the set of the KG's entities. We follow the widely adopted evaluation metrics for the link prediction task (Bordes et al., 2013). For each positive triple (h, r, t) in the test set, where h and t represent the head and tail entities and r is the relation, we corrupt it by replacing the head and tail entity with every other entity in the dataset to obtain invalid triples (h′, r, t) and (h, r, t′), respectively. Each method then ranks every triple by how likely it is to be valid. We measure filtered MRR (mean reciprocal rank) and filtered Hit@k (the fraction of positive triples correctly ranked in the top k), where corruptions that are themselves valid triples are excluded from the ranking. The higher the metric values, the better the method performs. (This filtered protocol is sketched after the table.) |
| Hardware Specification | Yes | All model training and evaluations were conducted on a single NVIDIA A100 GPU with 80G memory. |
| Software Dependencies | No | The paper mentions using "OpenKE (Han et al., 2018)" as the primary framework but does not specify its version or the versions of other software libraries/dependencies (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | Particularly for our method, the intersection features are within [1, 2, 3]-hop of each node, the learning rate λ is chosen among [1, 0.1, 0.05, 0.01, 0.005, 0.001], and the optimizer is chosen from {Adam, Adagrad, Adadelta, SGD}. Our model is trained for at most 4,000 epochs on all datasets. There are 128 MinHash functions used to estimate the 3-way Jaccard similarity of a triple. We note that the more MinHash functions we use, the better the estimate; however, due to the robustness of DNNs, we do not require a significantly better estimate. Similarly, we set HyperLogLog's size, i.e. p, to be 8. (MinHash and HyperLogLog sketches follow the table.) |
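To make the Research Type row concrete, here is a minimal sketch of a fully-connected scorer over precomputed intersection features. The class name, feature dimensionality, and layer sizes are illustrative assumptions, not the authors' configuration; see their repository for the actual hyperparameters.

```python
import torch
import torch.nn as nn

class IntersectionFeatureMLP(nn.Module):
    """A plain fully-connected network scoring a triple from its intersection features.

    feature_dim = 3 is an assumption: e.g., one estimated 3-way Jaccard
    similarity per [1, 2, 3]-hop neighborhood of the triple (h, r, t).
    """
    def __init__(self, feature_dim: int = 3, hidden_dim: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, 1),  # scalar plausibility score per triple
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # features: (batch, feature_dim) -> (batch,) scores
        return self.net(features).squeeze(-1)
```

Scores from a network like this can then be plugged into the filtered ranking protocol sketched next.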
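The filtered evaluation quoted in the Dataset Splits row reduces to a few lines. This is a sketch, not the paper's evaluation code: `score_fn` and the container types are hypothetical, and only tail corruption is shown (head corruption is symmetric).

```python
def filtered_metrics(test_triples, all_entities, known_triples, score_fn, ks=(1, 3, 10)):
    """Filtered MRR and Hit@k for tail prediction.

    score_fn(h, r, t) -> float is a hypothetical interface where higher means
    more plausible. known_triples is the set of all valid (h, r, t) across
    train/valid/test, used to filter true triples out of the candidate ranking.
    """
    rr_sum = 0.0
    hits = {k: 0 for k in ks}
    for h, r, t in test_triples:
        true_score = score_fn(h, r, t)
        rank = 1
        for e in all_entities:
            # Skip the true tail and any corruption that is itself a valid
            # triple: this is the "filtered" setting of Bordes et al. (2013).
            if e == t or (h, r, e) in known_triples:
                continue
            if score_fn(h, r, e) > true_score:
                rank += 1
        rr_sum += 1.0 / rank
        for k in ks:
            if rank <= k:
                hits[k] += 1
    n = len(test_triples)
    return rr_sum / n, {k: v / n for k, v in hits.items()}
```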
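The Experiment Setup row estimates the 3-way Jaccard similarity of a triple with 128 MinHash functions. A minimal sketch of that estimator, assuming universal hashing over Python's built-in `hash`; the input sets (e.g., neighborhood sets of h, r, and t) are placeholders:

```python
import random

NUM_HASHES = 128          # the paper uses 128 MinHash functions per triple
_P = (1 << 61) - 1        # a large Mersenne prime for universal hashing

random.seed(0)
_PARAMS = [(random.randrange(1, _P), random.randrange(_P)) for _ in range(NUM_HASHES)]

def minhash_signature(items):
    """128-slot MinHash signature of a set of hashable items."""
    return [min((a * hash(x) + b) % _P for x in items) for a, b in _PARAMS]

def jaccard3_estimate(set_a, set_b, set_c):
    """Estimate |A ∩ B ∩ C| / |A ∪ B ∪ C|.

    For each hash function, the minimizer over the union lies in all three
    sets with exactly 3-way-Jaccard probability, so the fraction of slots
    where the three signatures agree is an unbiased estimate.
    """
    sa, sb, sc = (minhash_signature(s) for s in (set_a, set_b, set_c))
    return sum(a == b == c for a, b, c in zip(sa, sb, sc)) / NUM_HASHES
```

For example, `jaccard3_estimate({1, 2, 3}, {2, 3, 4}, {3, 4, 5})` should return roughly 0.2, since only the element 3 lies in all three sets out of the five in the union.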
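The same row sets the HyperLogLog precision p to 8, i.e. 2⁸ = 256 registers; the paper presumably uses it to estimate set cardinalities cheaply (that use is an assumption). A minimal, self-contained sketch under that setting; the hashing choice (SHA-1 truncated to 64 bits) is also an assumption:

```python
import hashlib

class HyperLogLog:
    """Minimal HyperLogLog cardinality estimator with 2**p registers (p = 8 per the paper)."""

    def __init__(self, p: int = 8):
        self.p = p
        self.m = 1 << p                              # 256 registers at p = 8
        self.registers = [0] * self.m
        self.alpha = 0.7213 / (1 + 1.079 / self.m)   # bias correction for m >= 128

    def add(self, item):
        x = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = x >> (64 - self.p)                     # top p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)        # remaining 56 bits
        rank = (64 - self.p) - rest.bit_length() + 1  # leading zeros + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self) -> float:
        # Raw harmonic-mean estimate; production code adds range corrections.
        z = 1.0 / sum(2.0 ** -r for r in self.registers)
        return self.alpha * self.m * self.m * z
```

Adding 10,000 distinct items and calling `estimate()` should return roughly 10,000, with a typical relative error of about 1.04/√256 ≈ 6.5% at p = 8, which matches the paper's observation that DNN robustness tolerates coarse estimates.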