Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Systematic Relational Reasoning With Epistemic Graph Neural Networks

Authors: Irtaza Khalid, Steven Schockaert

ICLR 2025 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We show that Epi GNNs achieve state-of-the-art results on link prediction tasks that require systematic reasoning. Furthermore, for inductive knowledge graph completion, Epi GNNs rival the performance of state-of-the-art specialized approaches. Finally, we introduce two new benchmarks that go beyond standard relational reasoning by requiring the aggregation of information from multiple paths. Here, existing neuro-symbolic approaches fail, yet Epi GNNs learn to reason accurately. |
| Researcher Affiliation | Academia | Irtaza Khalid & Steven Schockaert, Cardiff University, UK, EMAIL |
| Pseudocode | No | The paper describes the proposed Epi GNN model and related algorithms conceptually using mathematical formulas and textual descriptions, but it does not include a distinct, structured pseudocode block or algorithm listing. |
| Open Source Code | Yes | Code and datasets are available at https://github.com/erg0dic/gnn-sg. |
| Open Datasets | Yes | Code and datasets are available at https://github.com/erg0dic/gnn-sg. ... We introduce two new benchmarks: one based on RCC-8 and one based on IA. ... We release these benchmarks under a CC-BY 4.0 license. |
| Dataset Splits | Yes | For CLUTRR, RCC-8 and IA, to test for systematic generalization, models are trained on small graphs and subsequently evaluated on larger graphs. ... We use a standard 80-20 split for training and validation for CLUTRR and RCC-8. For GraphLog, we use the validation set that is provided separately from the test set. ... In inductive KGC, models are evaluated on a test graph which is disjoint from the training graph. |
| Hardware Specification | Yes | All experiments were conducted using RTX 4090 and V100 NVIDIA GPUs. |
| Software Dependencies | No | The paper mentions "We use the Adam optimizer (Kingma & Ba, 2017)" but does not specify version numbers for any programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes | The number of layers of the Epi GNN model is fixed to 9 and the number of negative examples per instance is fixed as 1. The other hyperparameters of the Epi GNN model are tuned using grid search. The optimal values that were obtained are mentioned in Table 11. ... We conduct the following hyperparameter sweeps: learning rate in {0.00001, 0.001, 0.01, 0.1}, batch size in {16, 32, 64, 128}, number of facets m in {1, 2, 4, 8, 16, 32} and embedding dimension size in {8, 16, 32, 64, 128, 256}. We also tune the margin in the loss function over {10, 1.1, 1.0, 0.9, . . . , 0.1, 0.01}. |
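The grid search reported in the Experiment Setup row can be enumerated as a minimal sketch. The grid values below are those quoted from the paper; the parameter names and the enumeration helper are our own illustration (the authors' actual tuning harness is not shown here), and the margin grid is omitted because its quoted range is partially elided.

```python
from itertools import product

# Hyperparameter grids quoted from the paper's experiment setup.
# Key names are hypothetical labels for illustration.
GRID = {
    "learning_rate": [0.00001, 0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64, 128],
    "num_facets": [1, 2, 4, 8, 16, 32],
    "embedding_dim": [8, 16, 32, 64, 128, 256],
}

def grid_configs(grid):
    """Yield every hyperparameter combination as a dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid_configs(GRID))
print(len(configs))  # 4 * 4 * 6 * 6 = 576 combinations to evaluate
```

Each yielded dict would parameterize one training run; the best configuration per dataset is what the paper reports in its Table 11.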