Adaptive Path-Memory Network for Temporal Knowledge Graph Reasoning

Authors: Hao Dong, Zhiyuan Ning, Pengyang Wang, Ziyue Qiao, Pengfei Wang, Yuanchun Zhou, Yanjie Fu

IJCAI 2023

Reproducibility Variable | Result | LLM Response
--- | --- | ---
Research Type | Experimental | Extensive experiments conducted on four real-world TKG datasets demonstrate that our proposed model obtains substantial performance improvement and outperforms the state-of-the-art up to 4.8% absolute in MRR.
Researcher Affiliation | Academia | (1) Computer Network Information Center, Chinese Academy of Sciences, Beijing; (2) University of Chinese Academy of Sciences, Beijing; (3) State Key Laboratory of Internet of Things for Smart City, University of Macau, Macau; (4) The Hong Kong University of Science and Technology (Guangzhou), Guangzhou; (5) Department of Computer Science, University of Central Florida, Orlando
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Codes and datasets are all available at https://github.com/hhdo/DaeMon.
Open Datasets | Yes | Extensive experiments are conducted on four typical TKG datasets, namely, ICEWS18 [Jin et al., 2020], GDELT [Leetaru and Schrodt, 2013], WIKI [Leblay and Chekol, 2018], and YAGO [Mahdisoltani et al., 2014].
Dataset Splits | Yes | We also adopt the same strategy of dataset split as introduced in [Jin et al., 2020] and split the dataset into train/valid/test by timestamps such that (timestamps of the train) < (timestamps of the valid) < (timestamps of the test). (A minimal split sketch follows the table.)
Hardware Specification | Yes | All experiments are conducted with EPYC 7742 CPU, and 8 TESLA A100 GPUs.
Software Dependencies | No | The paper mentions 'Adam [Kingma and Ba, 2014]' as the optimizer but does not specify version numbers for any software dependencies such as Python, PyTorch, or CUDA.
Experiment Setup | Yes | For the Memory and PAU, the embedding dimension d is set to 64; the number of path aggregation layers w is set to 2; the activation function of aggregation is relu. Layer normalization and shortcut are conducted on the aggregation layers. For the Memory Passing, we perform grid search on the lengths of historical subgraph...For the parameter learning, negative sample number is set to 64; the hyperparameter α in regularization term is set to 1. Adam [Kingma and Ba, 2014] is adopted for parameter learning with the learning rate of 5e-4, and the max epoch of training is set to 30. (A training-configuration sketch follows the table.)