Faithful Rule Extraction for Differentiable Rule Learning Models

Authors: Xiaxia Wang, David Jaime Tena Cucala, Bernardo Cuenca Grau, Ian Horrocks

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In Section 6, we conduct a comprehensive evaluation on KG completion tasks. Amongst other findings, our experiments show that SMDRUM and MMDRUM obtain competitive performance and confirm the practical feasibility of the rule extraction algorithms proposed in Section 5.
Researcher Affiliation | Academia | Xiaxia Wang, David J. Tena Cucala, Bernardo Cuenca Grau, Ian Horrocks; Department of Computer Science, University of Oxford, UK
Pseudocode | Yes | Algorithm 1 outlines the procedure for extracting a faithful program from a DRUM model. Algorithm 2: Multipath Rule Extraction for a Fixed Dataset. Algorithm 3: SMDRUM Rule Extraction. Algorithm 4: MMDRUM Rule Extraction.
Open Source Code | Yes | The datasets and source code used in our experiments are available, with documentation, from the GitHub repository at https://github.com/xiaxia-wang/FaithfulRE.
Open Datasets | Yes | We used the 13 benchmark datasets (Appendix C) for inductive KG completion by Teru et al. (2020), based on FB15k-237 (Toutanova & Chen, 2015), NELL-995 (Xiong et al., 2017), WN18RR (Dettmers et al., 2018), and Family (Kok & Domingos, 2007).
Dataset Splits | Yes | We used the 13 benchmark datasets (Appendix C) for inductive KG completion by Teru et al. (2020), based on FB15k-237 (Toutanova & Chen, 2015), NELL-995 (Xiong et al., 2017), WN18RR (Dettmers et al., 2018), and Family (Kok & Domingos, 2007), preserving the train, validation, and test splits.
Hardware Specification | Yes | All experiments were conducted on a Linux workstation with a Xeon E5-2670 CPU.
Software Dependencies | Yes | We re-implemented the models with Python 3.8 and PyTorch 2.0.1.
Experiment Setup | Yes | We followed Sadeghian et al. (2019) for all default training settings, such as the log-likelihood loss function, the Adam optimizer, and a maximum of 10 training epochs. An early-stopping strategy was adopted for each model based on the prediction loss on the validation set. We selected L = 2 for all models and rank N = 3 for all DRUM-based models. Each model was trained for up to 10 epochs. The threshold β ∈ (0, 1) for each model is a hyperparameter; we tried several values and picked the one maximising F1-score on the validation sets.
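To make the threshold-selection step above concrete, the sketch below sweeps candidate values of β and keeps the one that maximises F1-score on the validation set. It is a minimal illustration, not the authors' implementation: the function names (select_threshold, f1), the inputs val_scores and val_labels, and the candidate grid are all hypothetical; the paper only states that several values were tried.

def f1(labels, preds):
    # F1-score for binary 0/1 ground-truth labels and predictions.
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def select_threshold(val_scores, val_labels,
                     candidates=(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9)):
    # Pick beta in (0, 1) maximising F1 on the validation set.
    # val_scores: model confidences for candidate facts (floats in [0, 1]);
    # val_labels: ground-truth 0/1 labels. The candidate grid is illustrative.
    best_beta, best_f1 = None, -1.0
    for beta in candidates:
        preds = [1 if s >= beta else 0 for s in val_scores]
        score = f1(val_labels, preds)
        if score > best_f1:
            best_beta, best_f1 = beta, score
    return best_beta, best_f1

Under this sketch, one would call select_threshold once per model and dataset on its validation scores and labels, then apply the returned β when evaluating on the test split.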