Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Neural Methods for Logical Reasoning over Knowledge Graphs
Authors: Alfonso Amayuelas, Shuai Zhang, Xi Susie Rao, Ce Zhang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate experimentally the performance of our model through extensive experimentation on well-known benchmarking datasets. |
| Researcher Affiliation | Academia | EPFL, ETH Zurich |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Source code available on: https://github.com/amayuelas/NNKGReasoning |
| Open Datasets | Yes | We perform experiments on three standard datasets in KG benchmarks. These are the same datasets used in Query2Box (Ren et al., 2020) and BetaE (Ren & Leskovec, 2020): FB15k (Bordes et al., 2013), FB15k-237 (Toutanova et al., 2015) and NELL995 (Xiong et al., 2017b). |
| Dataset Splits | Yes | In the experiments, we use the standard evaluation scheme for Knowledge Graphs, where edges are split into training, test and validation sets. ... we effectively create 3 graphs: G_train for training; G_valid, which contains G_train plus the validation edges; and G_test, which contains G_valid and the test edges. |
| Hardware Specification | Yes | All experiments have been computed on independent processes on NVIDIA GPUs, either the GeForce GTX Titan X Pascal (12 GB) or the Tesla T4 (16 GB). |
| Software Dependencies | No | The paper states 'Our code is implemented using PyTorch.' but does not provide specific version numbers for PyTorch or any other software dependencies, such as Python or CUDA. |
| Experiment Setup | Yes | All our models and GQE use the following parameters: Embed. dim = 800, learning rate = 0.0001, negative sample size = 128, batch size = 512, margin = 24, num. iterations = 300,000/450,000. Q2B and BetaE differ from the previous configuration in Embed. dim = 400 and margin = 30/60. |
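
The Dataset Splits row describes three nested graphs, each a superset of the previous one. The following is a minimal sketch of that nesting on toy data, not the authors' code: the function name `build_nested_graphs`, the `Triple` alias, and the example edges are all hypothetical, and the real benchmark files may be stored in a different format.

```python
from typing import Iterable, Set, Tuple

# Hypothetical edge representation: (head entity, relation, tail entity).
Triple = Tuple[str, str, str]


def build_nested_graphs(
    train_edges: Iterable[Triple],
    valid_edges: Iterable[Triple],
    test_edges: Iterable[Triple],
) -> Tuple[Set[Triple], Set[Triple], Set[Triple]]:
    """Return (G_train, G_valid, G_test) as nested edge sets."""
    g_train = set(train_edges)
    # G_valid contains G_train plus the validation edges.
    g_valid = g_train | set(valid_edges)
    # G_test contains G_valid plus the test edges.
    g_test = g_valid | set(test_edges)
    return g_train, g_valid, g_test


# Toy edges, invented purely for illustration.
train = [("a", "r1", "b"), ("b", "r2", "c")]
valid = [("c", "r1", "d")]
test = [("d", "r2", "e")]

g_train, g_valid, g_test = build_nested_graphs(train, valid, test)
print(len(g_train), len(g_valid), len(g_test))
```

Under this reading, a query evaluated on the validation graph can have answers that are only reachable through validation edges, which is why each graph must include all edges of the one before it.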