Probabilistic Entity Representation Model for Reasoning over Knowledge Graphs

Authors: Nurendra Choudhary, Nikhil Rao, Sumeet Katariya, Karthik Subbian, Chandan K. Reddy

Venue: NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On the logical query reasoning problem, we demonstrate that the proposed PERM significantly outperforms the state-of-the-art methods on various public benchmark KG datasets on standard evaluation metrics. We also evaluate PERM’s competence on a COVID-19 drug repurposing case study and show that our proposed work is able to recommend drugs with substantially better F1 than current methods.
Researcher Affiliation | Collaboration | Nurendra Choudhary (1), Nikhil Rao (2), Sumeet Katariya (2), Karthik Subbian (2), Chandan K. Reddy (1,2); (1) Department of Computer Science, Virginia Tech, Arlington, VA; (2) Amazon, Palo Alto, CA
Pseudocode | No | The paper describes the model and its operations mathematically and textually, but it does not include any pseudocode or algorithm blocks. (An illustrative stand-in for the core Gaussian operation is sketched after this table.)
Open Source Code | Yes | Implementation code: https://github.com/Akirato/PERM-GaussianKG
Open Datasets | Yes | We utilize the following standard benchmark datasets to compare PERM’s performance on the task of reasoning over KGs: FB15K-237 [24], NELL995 [25], DBPedia, DRKG [26]. More detailed statistics of these datasets are provided in Table 1.
Dataset Splits | Yes | Table 1 reports dataset statistics (unique entities, relations, and edges) along with the splits of dataset triples used in the experiments: FB15k-237: 272,115 training / 17,526 validation / 20,438 test; NELL995: 114,213 / 14,324 / 14,267; DBPedia: 168,659 / 24,095 / 48,188; DRKG: 4,111,989 / 587,428 / 1,174,854. (A split-count check is sketched after this table.)
Hardware Specification | Yes | All our models are implemented in Pytorch [23] and run on four Quadro RTX 8000. (A multi-GPU usage sketch follows after this table.)
Software Dependencies | No | The paper mentions 'Pytorch' but does not specify a version number for it or for any other software dependency.
Experiment Setup | No | The paper describes some aspects of the experimental setup, such as the self-attention mechanism and the linear solver, and notes that training is performed on G_train with validation on G_valid, but the main text does not provide concrete hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings). (A hypothetical configuration sketch follows below.)
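
Since the paper presents PERM only in equations and prose (see the Pseudocode row above), the following is a minimal sketch of the Gaussian-density view the paper works with: an entity is a multivariate Gaussian N(mu, cov), and a conjunction of two entity densities can be formed as their renormalized product, which is again Gaussian. This is standard Gaussian algebra offered as an illustration, not the authors' code; all names and dimensions are assumptions.

```python
import torch

def gaussian_product(mu1, cov1, mu2, cov2):
    """Closed-form (unnormalized) product of two multivariate Gaussians.

    N(mu1, cov1) * N(mu2, cov2) is proportional to N(mu, cov) with
    cov = (cov1^-1 + cov2^-1)^-1 and mu = cov @ (cov1^-1 @ mu1 + cov2^-1 @ mu2).
    """
    p1 = torch.linalg.inv(cov1)  # precision matrix of the first density
    p2 = torch.linalg.inv(cov2)  # precision matrix of the second density
    cov = torch.linalg.inv(p1 + p2)
    mu = cov @ (p1 @ mu1 + p2 @ mu2)
    return mu, cov

d = 4  # illustrative embedding dimension, not from the paper
mu_a, mu_b = torch.randn(d), torch.randn(d)
A, B = torch.randn(d, d), torch.randn(d, d)
cov_a = A @ A.T + d * torch.eye(d)  # positive definite by construction
cov_b = B @ B.T + d * torch.eye(d)

mu, cov = gaussian_product(mu_a, cov_a, mu_b, cov_b)
print(mu.shape, cov.shape)  # torch.Size([4]) torch.Size([4, 4])
```

The closed form is what makes intersection-style query operators tractable in a Gaussian representation: the result stays in the same family, so operators can be chained.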
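
To make the Dataset Splits row concrete, here is a minimal sketch that checks the Table 1 triple counts for FB15k-237 against local files. The tab-separated train.txt/valid.txt/test.txt layout and the data/FB15k-237 path are assumptions about a typical benchmark release, not taken from the paper.

```python
from pathlib import Path

# Expected triple counts from Table 1 of the paper (FB15k-237).
EXPECTED = {"train.txt": 272_115, "valid.txt": 17_526, "test.txt": 20_438}

def count_triples(path: Path) -> int:
    """Count non-empty lines; each line holds one (head, relation, tail) triple."""
    with path.open(encoding="utf-8") as f:
        return sum(1 for line in f if line.strip())

data_dir = Path("data/FB15k-237")  # hypothetical local path
for name, expected in EXPECTED.items():
    actual = count_triples(data_dir / name)
    status = "OK" if actual == expected else f"MISMATCH (expected {expected})"
    print(f"{name}: {actual} triples {status}")
```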
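
The Hardware Specification row names four Quadro RTX 8000 cards, but the paper does not say how they are used. Below is a minimal sketch of one common way to spread a PyTorch model over four GPUs, using torch.nn.DataParallel; the authors' actual parallelization strategy is unknown, and the toy model here is a stand-in, not PERM.

```python
import torch
import torch.nn as nn

# Placeholder model standing in for PERM; the real architecture is Gaussian-based.
model = nn.Sequential(nn.Linear(400, 400), nn.ReLU(), nn.Linear(400, 400))

if torch.cuda.device_count() >= 4:
    # Replicate the model across four GPUs and split each batch among them.
    model = nn.DataParallel(model, device_ids=[0, 1, 2, 3])
model = model.to("cuda" if torch.cuda.is_available() else "cpu")

batch = torch.randn(128, 400, device=next(model.parameters()).device)
out = model(batch)  # forward pass is transparently sharded across devices
print(out.shape)
```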
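
Finally, the Experiment Setup row flags missing hyperparameters. The sketch below shows the kind of configuration a reproduction would need to pin down; every value is a hypothetical placeholder, and none comes from the paper.

```python
import torch

# Hypothetical training configuration: none of these values appear in the
# paper's main text; they only illustrate what a reproducible setup specifies.
config = {
    "learning_rate": 1e-4,   # placeholder, not from the paper
    "batch_size": 512,       # placeholder, not from the paper
    "epochs": 100,           # placeholder, not from the paper
    "embedding_dim": 400,    # placeholder, not from the paper
}

# Stand-in parameters; the real model would be PERM's Gaussian embeddings.
params = [torch.nn.Parameter(torch.randn(10, config["embedding_dim"]))]
# The optimizer choice (Adam) is also an assumption, not stated in the paper.
optimizer = torch.optim.Adam(params, lr=config["learning_rate"])
print(optimizer)
```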