Reasoning Like Human: Hierarchical Reinforcement Learning for Knowledge Graph Reasoning

Authors: Guojia Wan, Shirui Pan, Chen Gong, Chuan Zhou, Gholamreza Haffari

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that our proposed model achieves substantial improvements in ambiguous relation tasks.
Researcher Affiliation | Academia | Guojia Wan1, Shirui Pan2, Chen Gong3,4, Chuan Zhou5, Gholamreza Haffari2; 1 School of Computer Science, Institute of Artificial Intelligence, and National Engineering Research Center for Multimedia Software, Wuhan University, China; 2 Faculty of Information Technology, Monash University, Australia; 3 School of Computer Science and Engineering, Nanjing University of Science and Technology, China; 4 Department of Computing, Hong Kong Polytechnic University, Hong Kong, China; 5 Academy of Mathematics and Systems Science, Chinese Academy of Sciences, China; guojiawan@whu.edu.cn, shirui.pan@monash.edu, chen.gong@njust.edu.cn, zhouchuan@amss.ac.cn, gholamreza.haffari@monash.edu
Pseudocode | Yes | Algorithm 1 Training Procedure (a hedged sketch of such a hierarchical training loop is given after the table).
Open Source Code | No | The paper does not explicitly state that the source code for the described methodology is publicly available, nor does it provide a direct link to such code.
Open Datasets | Yes | We conducted experiments on three datasets: 1) NELL-995, released by [Xiong et al., 2017]; 2) FB15K-237, a subset of FB15K with inverse relations removed, whose entities all appear in the Wikilinks database; 3) WN18RR, a subset of WordNet, which provides semantic knowledge of words.
Dataset Splits | No | The paper mentions optimizing parameters on the 'valid set' but does not provide specific details about the training/validation/test dataset splits (e.g., percentages or counts) or how they were defined for reproduction.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using the ADAM optimizer but does not provide specific version numbers for any ancillary software dependencies (e.g., libraries, frameworks, or programming languages) required for replication.
Experiment Setup | Yes | The maximum length of the high-level policy l_H is fixed to 4, and the maximum length of the low-level policy l_L is fixed to 2. The reward factor γ is 1.2, and the batch size κ is 100. The vector dimension d is 100. The clustering number for the i-th relation cluster is 2i 1. The regularization λ is 0.005. (These values are collected in the configuration sketch after the table.)
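
The Pseudocode row refers to Algorithm 1 (the training procedure), which is not reproduced on this page. Below is a minimal sketch, assuming a PyTorch implementation, of what a hierarchical policy-gradient training step of this kind could look like. The PolicyNet class, the env interface (reset/step/terminal_reward), and all names and dimensions are hypothetical illustrations, not the authors' code.

```python
# Hypothetical sketch of a hierarchical policy-gradient (REINFORCE) training step,
# in the spirit of Algorithm 1: a high-level policy chooses a relation cluster and
# a low-level policy chooses relations within it, with a terminal reward scoring
# the reasoning path. PolicyNet, the env interface, and all dimensions are
# illustrative assumptions, not the authors' released implementation.
import torch
import torch.nn as nn


class PolicyNet(nn.Module):
    """Minimal MLP policy: state vector -> categorical action distribution."""

    def __init__(self, state_dim: int, n_actions: int, hidden: int = 100):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions)
        )

    def forward(self, state: torch.Tensor) -> torch.distributions.Categorical:
        return torch.distributions.Categorical(logits=self.net(state))


def reinforce_update(high_policy, low_policy, env, optimizer,
                     max_high_steps=4, max_low_steps=2):
    """One policy-gradient update from a single rollout (batching omitted)."""
    log_probs, done = [], False
    state = env.reset()                      # assumed: returns a state embedding tensor
    for _ in range(max_high_steps):          # high-level step: pick a relation cluster
        cluster_dist = high_policy(state)
        cluster = cluster_dist.sample()
        log_probs.append(cluster_dist.log_prob(cluster))
        for _ in range(max_low_steps):       # low-level step: pick a relation in that cluster
            rel_dist = low_policy(state)
            relation = rel_dist.sample()
            log_probs.append(rel_dist.log_prob(relation))
            state, done = env.step(cluster.item(), relation.item())
            if done:
                break
        if done:
            break
    reward = env.terminal_reward()           # assumed: e.g. 1.0 if the target entity is reached
    loss = -reward * torch.stack(log_probs).sum()   # maximize expected terminal reward
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the actual model the state would encode the query relation and the traversal history, and the low-level action space would be restricted to the relations inside the sampled cluster; both details are abstracted away in this sketch.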
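
For convenience, the hyperparameters quoted under Experiment Setup can be gathered into a single configuration block. This only restates the reported values; the key names are hypothetical, and the clustering-number formula is omitted because the extracted text ("2i 1") is ambiguous.

```python
# Hyperparameters as reported in the "Experiment Setup" row; key names are
# hypothetical placeholders, values are taken verbatim from the paper text.
HPARAMS = {
    "max_high_level_steps": 4,       # maximum length l_H of the high-level policy
    "max_low_level_steps": 2,        # maximum length l_L of the low-level policy
    "reward_factor": 1.2,            # reward factor gamma
    "batch_size": 100,               # batch size kappa
    "embedding_dim": 100,            # vector dimension d
    "regularization": 0.005,         # regularization lambda
    "optimizer": "Adam",             # the paper reports using the ADAM optimizer
    # The per-relation clustering number ("2i 1" in the extracted text) is not
    # reproduced here because the printed formula is ambiguous.
}
```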