Hierarchical Relational Learning for Few-Shot Knowledge Graph Completion
Authors: Han Wu, Jie Yin, Bala Rajaratnam, Jianyuan Guo
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark datasets validate the superiority of HiRe over state-of-the-art methods. |
| Researcher Affiliation | Academia | Han Wu¹, Jie Yin¹, Bala Rajaratnam¹,² & Jianyuan Guo¹ (¹The University of Sydney, ²University of California, Davis); {han.wu,jie.yin,bala.rajaratnam,jguo5172}@sydney.edu.au |
| Pseudocode | Yes | Algorithm 1: MAML-based training framework of HiRe (a hedged sketch of this meta-training loop is given after the table). |
| Open Source Code | Yes | The code can be found at https://github.com/alexhw15/HiRe.git. |
| Open Datasets | Yes | We conduct experiments on two widely used few-shot KG completion datasets, NELL-One and Wiki-One, constructed by Xiong et al. (2018). |
| Dataset Splits | Yes | We use 51/5/11 and 133/16/34 tasks for training/validation/testing on NELL-One and Wiki-One, respectively, following the common setting in the literature. |
| Hardware Specification | Yes | All models are implemented in PyTorch and trained on a single Tesla P100 GPU. |
| Software Dependencies | No | The paper mentions 'implemented in PyTorch' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For fair comparison, we use the entity and relation embeddings pretrained by TransE (Bordes et al., 2013) on both datasets, released by GMatching (Xiong et al., 2018), to initialize HiRe. Following the literature, the embedding dimension is set to 100 and 50 for NELL-One and Wiki-One, respectively. On both datasets, we set the number of SABs to 1, and each SAB contains one self-attention head. We apply drop path with a drop rate of 0.2 to avoid overfitting. The maximum number of neighbors for a given entity is set to 50, the same as in prior works. For all experiments except the sensitivity test on the trade-off parameter λ in Eq. 21, λ is set to 0.05 and the number of false contexts per reference triplet is set to 1. The margin γ in Eq. 12 is set to 1. We train with mini-batch gradient descent using a batch size of 1,024 on both datasets, and use the Adam optimizer with a learning rate of 0.001. We evaluate HiRe on the validation set every 1,000 steps and choose the best model within 30,000 steps based on MRR. (The loss terms and hyperparameters are collected in the hedged sketches after the table.) |
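
To make the Algorithm 1 row concrete, here is a minimal sketch of a MAML-style meta-training loop of the kind the paper describes: adapt the relation representation on a task's support set with one inner gradient step, then update the model parameters from the query-set loss. All names here (`meta_train`, `task_sampler`, `model.relation_representation`, `model.loss`) are hypothetical placeholders; the authors' actual implementation lives at https://github.com/alexhw15/HiRe.git.

```python
# Hedged sketch of a MAML-based meta-training loop (Algorithm 1 in the paper).
# Hypothetical interfaces; not the authors' code.
import torch


def meta_train(model, task_sampler, steps=30_000, inner_lr=0.001, outer_lr=0.001):
    optimizer = torch.optim.Adam(model.parameters(), lr=outer_lr)
    for step in range(steps):
        # Sample one few-shot task: K reference (support) triplets + query triplets.
        support, query = task_sampler.sample()

        # Meta-representation of the task's relation, computed from the support set.
        rel = model.relation_representation(support)

        # Inner loop: one MAML adaptation step on the support-set loss.
        support_loss = model.loss(rel, support)
        (grad,) = torch.autograd.grad(support_loss, rel, create_graph=True)
        rel_adapted = rel - inner_lr * grad

        # Outer loop: evaluate the adapted representation on the query set
        # and backpropagate through the adaptation step.
        query_loss = model.loss(rel_adapted, query)
        optimizer.zero_grad()
        query_loss.backward()
        optimizer.step()
```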
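The setup row fixes the margin γ = 1 (Eq. 12) and the trade-off λ = 0.05 (Eq. 21). Below is a hedged sketch of how these two terms might combine, assuming higher scores mean more plausible triplets and an InfoNCE-style contrastive term over true versus false contexts; the temperature `tau` is our own placeholder, not a value reported in the paper.

```python
# Hedged sketch of the margin-based ranking loss and λ-weighted joint objective.
import torch
import torch.nn.functional as F


def margin_ranking_loss(pos_score, neg_score, gamma=1.0):
    # Margin loss in the spirit of Eq. 12 (γ = 1 in the paper), assuming
    # higher scores indicate more plausible triplets.
    return F.relu(gamma + neg_score - pos_score).mean()


def contrastive_context_loss(anchor, true_ctx, false_ctx, tau=0.5):
    # One common InfoNCE-style instantiation of the contrastive term: each
    # reference triplet's true context is contrasted against a false context
    # (1 false context per triplet in the reported setup). tau is hypothetical.
    pos = F.cosine_similarity(anchor, true_ctx, dim=-1) / tau
    neg = F.cosine_similarity(anchor, false_ctx, dim=-1) / tau
    return -torch.log(torch.exp(pos) / (torch.exp(pos) + torch.exp(neg))).mean()


def joint_loss(task_loss, ctr_loss, lam=0.05):
    # Joint objective in the spirit of Eq. 21: task loss plus λ-weighted
    # contrastive term (λ = 0.05 in the reported experiments).
    return task_loss + lam * ctr_loss
```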
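Finally, the reported hyperparameters collected into a single configuration object for reference. The values come from the paper; the `HiReConfig` dataclass itself is our own framing, not the authors' API.

```python
# Reported hyperparameters, gathered into one hypothetical config object.
from dataclasses import dataclass


@dataclass
class HiReConfig:
    embed_dim: int               # 100 for NELL-One, 50 for Wiki-One (TransE-pretrained)
    num_sab: int = 1             # one SAB with one self-attention head
    drop_path_rate: float = 0.2  # drop path against overfitting
    max_neighbors: int = 50      # neighbor cap per entity, as in prior works
    lam: float = 0.05            # trade-off λ in Eq. 21
    num_false_contexts: int = 1  # false contexts per reference triplet
    margin: float = 1.0          # margin γ in Eq. 12
    batch_size: int = 1024
    lr: float = 0.001            # Adam learning rate
    eval_every: int = 1000       # validate (MRR) every 1,000 steps
    max_steps: int = 30000       # best model chosen within 30,000 steps


nell_one = HiReConfig(embed_dim=100)
wiki_one = HiReConfig(embed_dim=50)
```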