TILP: Differentiable Learning of Temporal Logical Rules on Knowledge Graphs
Authors: Siheng Xiong, Yuan Yang, Faramarz Fekri, James Clayton Kerce
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare TILP with state-of-the-art methods on two benchmark datasets. We show that our proposed framework can improve upon the performance of baseline methods while providing interpretable results. Experiments on two benchmark datasets, i.e., WIKIDATA12k and YAGO11k, are conducted, where our framework shows comparable or improved performance relative to the state-of-the-art methods. The results of the experiments are shown in Table 1 with the efficiency study given in Appendix C. Ablation studies on the temporal feature modeling module (TILP w/o tfm) are also conducted. |
| Researcher Affiliation | Academia | Siheng Xiong, Yuan Yang, Faramarz Fekri & James Clayton Kerce; Georgia Institute of Technology, Atlanta, GA 30332, USA; {sxiong45,yyang754}@gatech.edu, faramarz.fekri@ece.gatech.edu, clayton.kerce@gtri.gatech.edu |
| Pseudocode | No | The paper describes mathematical formulations and processes but does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that its source code is open or publicly available. |
| Open Datasets | Yes | We evaluate TILP on two standard tKG datasets, WIKIDATA12k and YAGO11k (Dasgupta et al., 2018). WIKIDATA is a large knowledge base based on Wikipedia. To form the WIKIDATA12k dataset, a subgraph with temporal information is extracted by Dasgupta et al. (2018). YAGO is another large knowledge graph built from multilingual Wikipedias. Similarly, some temporally associated facts are distilled out from YAGO3 to form the YAGO11k dataset (Dasgupta et al., 2018). |
| Dataset Splits | Yes | For the WIKIDATA12k dataset, the start time ranges for the training, validation and test sets are [0, 2008], [2008, 2012] and [2012, 2018], respectively. For the YAGO11k dataset, the start time ranges for the training, validation and test sets are [−431, 2006], [2006, 2011] and [2011, 2022], respectively. A sketch of this start-time-based split is given below the table. |
| Hardware Specification | No | The paper mentions 'on a 4-CPU machine' in Appendix C, but this is too general and does not provide specific hardware models (e.g., CPU model, GPU, memory) to be considered a reproducible hardware specification. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies used in the experiments. |
| Experiment Setup | Yes | For the link prediction task on data of the form (e_s, r, e_o, I), we generate a list of ranked candidates for both object prediction (e_s, r, ?, I) and subject prediction (e_o, r⁻¹, ?, I). The maximum rule length is set to 5 for both datasets. The standard metrics mean reciprocal rank (MRR), hit@1 and hit@10 are used for comparison of the methods. Similar to Jain et al. (2020), we perform time-aware filtering, which gives a more valid performance evaluation. The training is divided into two phases. In the first phase, the attention vectors for predicates, TRs and rule length are learned by maximizing the score of correct candidates. In the second phase, all the distribution parameters of temporal features are fitted to the training samples; then the weights of the temporal feature modeling module are trained with frozen attention vectors, i.e., ϕ_TLR is used for prediction in the first phase and ϕ_TILP in the second. Sketches of the filtered metric computation and the two-phase schedule are given below the table. |
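
The quoted splits are defined purely by each fact's start time. Below is a minimal sketch of how such a split could be reproduced, assuming facts are stored as (subject, relation, object, start_year, end_year) tuples; the boundary handling at the shared endpoints (e.g., 2008 and 2012) is an assumption, since the quoted ranges overlap there, and the actual preprocessing of Dasgupta et al. (2018) may differ.

```python
# Hypothetical reconstruction of the start-time-based splits quoted above.
# Facts are assumed to be (subject, relation, object, start_year, end_year)
# tuples; assigning the shared boundary years to the earlier split is an
# assumption, since the quoted ranges overlap at their endpoints.

def split_by_start_time(facts, train_end, valid_end):
    """Partition temporal facts into train/valid/test by start year."""
    train = [f for f in facts if f[3] <= train_end]
    valid = [f for f in facts if train_end < f[3] <= valid_end]
    test = [f for f in facts if valid_end < f[3]]
    return train, valid, test

# Toy facts standing in for WIKIDATA12k quadruples.
facts = [
    ("A", "memberOf", "B", 1995, 2001),
    ("C", "memberOf", "D", 2010, 2014),
    ("E", "awardedTo", "F", 2015, 2015),
]
# WIKIDATA12k boundaries: train up to 2008, valid up to 2012, test after.
train, valid, test = split_by_start_time(facts, train_end=2008, valid_end=2012)
print(len(train), len(valid), len(test))  # -> 1 1 1
```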
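The quoted setup ranks all candidate entities and applies time-aware filtering (Jain et al., 2020) before computing MRR, hit@1 and hit@10. The sketch below illustrates one plausible reading of this procedure; the data layout, the `score_fn` interface and the tie handling are assumptions for illustration, not TILP's released code.

```python
# Hypothetical sketch of time-aware filtered ranking: when ranking candidates
# for (e_s, r, ?, I), every *other* entity that forms a true fact with the
# same (e_s, r) over the same interval is pushed below the gold candidate.

from collections import defaultdict

def evaluate(queries, score_fn, known_facts, num_entities):
    """Compute MRR, hit@1 and hit@10 with time-aware filtering."""
    # Index true objects by (subject, relation, interval) for filtering.
    true_objects = defaultdict(set)
    for s, r, o, interval in known_facts:
        true_objects[(s, r, interval)].add(o)

    mrr = hit1 = hit10 = 0.0
    for s, r, gold, interval in queries:
        scores = [score_fn(s, r, o, interval) for o in range(num_entities)]
        for o in true_objects[(s, r, interval)]:
            if o != gold:
                scores[o] = float("-inf")  # filter other known-true answers
        rank = 1 + sum(sc > scores[gold] for sc in scores)
        mrr += 1.0 / rank
        hit1 += rank <= 1
        hit10 += rank <= 10
    n = len(queries)
    return mrr / n, hit1 / n, hit10 / n

# Toy usage: 3 entities, a dummy scorer, one query whose answer is entity 2.
score = lambda s, r, o, I: {0: 0.1, 1: 0.9, 2: 0.5}[o]
facts = [(0, "r", 1, (2000, 2004)), (0, "r", 2, (2000, 2004))]
print(evaluate([(0, "r", 2, (2000, 2004))], score, facts, num_entities=3))
```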
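The two-phase schedule, with phase 1 learning the attention vectors via ϕ_TLR and phase 2 training the temporal feature modeling (tfm) module with the attention frozen via ϕ_TILP, could be organized as in the PyTorch sketch below. The scoring functions, parameter shapes and optimizer settings are placeholders, not the authors' implementation.

```python
import torch

num_preds, feat_dim = 24, 8

# Phase-1 parameter: attention vector over predicates (attention over TRs and
# rule length would be handled analogously and is omitted for brevity).
attn_pred = torch.randn(num_preds, requires_grad=True)

# Phase-2 parameters: weights of the temporal feature modeling (tfm) module.
tfm_weights = torch.randn(feat_dim, requires_grad=True)

def phi_tlr(batch):
    # Placeholder rule score driven by the predicate attention only.
    return batch @ attn_pred.softmax(0)[: batch.shape[1]]

def phi_tilp(batch, feats):
    # Placeholder score combining phi_TLR with weighted temporal features.
    return phi_tlr(batch) + feats @ tfm_weights

batch = torch.rand(16, 10)        # toy grounded-rule statistics per sample
feats = torch.rand(16, feat_dim)  # toy temporal features per sample

# Phase 1: learn the attention vector by maximizing phi_TLR of correct
# candidates (the distribution parameters of the temporal features would be
# fitted to the training samples between the two phases).
opt1 = torch.optim.Adam([attn_pred], lr=1e-2)
for _ in range(100):
    loss = -phi_tlr(batch).mean()
    opt1.zero_grad()
    loss.backward()
    opt1.step()

# Phase 2: freeze the attention, then train the tfm weights against phi_TILP.
attn_pred.requires_grad_(False)
opt2 = torch.optim.Adam([tfm_weights], lr=1e-2)
for _ in range(100):
    loss = -phi_tilp(batch, feats).mean()
    opt2.zero_grad()
    loss.backward()
    opt2.step()
```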