Image-embodied Knowledge Representation Learning
Authors: Ruobing Xie, Zhiyuan Liu, Huanbo Luan, Maosong Sun
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our IKRL models on knowledge graph completion and triple classification. Experimental results demonstrate that our models outperform all baselines on both tasks, which indicates the significance of visual information for knowledge representations and the capability of our models in learning knowledge representations with images. |
| Researcher Affiliation | Academia | 1 Department of Computer Science and Technology, State Key Lab on Intelligent Technology and Systems, National Lab for Information Science and Technology, Tsinghua University, China 2 Jiangsu Collaborative Innovation Center for Language Ability, Jiangsu Normal University, China |
| Pseudocode | No | The paper describes its model architecture and optimization steps but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code and dataset of this paper can be obtained from https://github.com/thunlp/IKRL. |
| Open Datasets | Yes | In this paper, we construct a new dataset of knowledge graph combined with images named WN9-IMG for evaluation tasks including knowledge graph completion and triple classification. The triple part of WN9-IMG is a subset of the classical KG dataset WN18 [Bordes et al., 2014], originally extracted from WordNet [Miller, 1995]. For the consideration of image quality, we use 63,225 images extracted from ImageNet [Deng et al., 2009]. |
| Dataset Splits | Yes | We ensure that all entities in WN9-IMG have images, and randomly split the extracted triples into train, validation and test sets. The statistics of WN9-IMG are listed in Table 1: 9 relations, 6,555 entities, and 11,741 / 1,337 / 1,319 train / valid / test triples (a construction-and-split sketch follows the table). |
| Hardware Specification | No | The paper mentions using "GPU to accelerate image representation" but does not specify any particular GPU model (e.g., NVIDIA A100, RTX series) or other hardware components like CPU or memory details. |
| Software Dependencies | No | The paper mentions software like "Caffe [Jia et al., 2014]" and that it uses "AlexNet" which is "pre-trained on ILSVRC 2012", but it does not provide specific version numbers for Caffe or any other libraries/frameworks used (a feature-extraction sketch follows the table). |
| Experiment Setup | Yes | We train the IKRL model via mini-batch SGD, with the margin γ selected from {1.0, 2.0, 4.0}. The learning rate λ could be either empirically fixed at a value from {0.0002, 0.0005, 0.001}, or designed following a flexible adaptive strategy that descends through iterations. The optimal configuration of the IKRL model is γ = 4.0, with the learning rate following a linearly declining strategy in which λ ranges from 0.001 to 0.0002. To balance efficiency and diversity, the image number n for each entity is at most 10. We also set the dimension of image feature embeddings di = 4096, and the dimension of entity and relation embeddings ds = 50 (a training-setup sketch follows the table). |
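The WN9-IMG construction quoted above (keep only WN18 triples whose head and tail both have ImageNet images, then split randomly) can be sketched as follows. This is a minimal illustration under assumptions: the in-memory triple format, the `entity_images` mapping, the shuffle seed, and the exact split procedure are hypothetical, with split sizes taken from Table 1.

```python
import random

# Hypothetical inputs: `triples` is a list of (head, relation, tail) tuples
# from WN18; `entity_images` maps a WordNet synset id to its ImageNet images.
def build_wn9_img(triples, entity_images, seed=42):
    """Keep only triples whose entities all have images, then split randomly."""
    covered = [t for t in triples
               if t[0] in entity_images and t[2] in entity_images]
    random.Random(seed).shuffle(covered)
    # Split sizes follow Table 1: 11,741 train / 1,337 valid / 1,319 test.
    train = covered[:11741]
    valid = covered[11741:11741 + 1337]
    test = covered[11741 + 1337:]
    return train, valid, test
```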
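The software row notes that image representations come from AlexNet pre-trained on ILSVRC 2012 via Caffe, which matches the 4096-dimensional image features (di = 4096) in the setup row. The sketch below extracts comparable fc7 activations with torchvision's ImageNet-pretrained AlexNet; the Caffe-to-torchvision swap, the preprocessing constants, and a torchvision >= 0.13 API are this sketch's assumptions, not the paper's pipeline.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Stand-in for the paper's Caffe AlexNet: torchvision's ILSVRC-2012-pretrained
# AlexNet, read out at the penultimate fully connected layer (fc7, 4096-d).
alexnet = models.alexnet(weights=models.AlexNet_Weights.IMAGENET1K_V1)
alexnet.eval()
fc7 = torch.nn.Sequential(
    alexnet.features, alexnet.avgpool, torch.nn.Flatten(),
    *list(alexnet.classifier.children())[:-1],  # drop the 1000-way classifier
)

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def image_embedding(path):
    """Return one image's 4096-d feature vector (d_i = 4096)."""
    x = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        return fc7(x).squeeze(0)  # shape: (4096,)
```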
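The experiment-setup row fixes the margin and the linearly declining learning rate; a minimal sketch of those two ingredients follows. The hinge-style margin loss reflects the TransE lineage the paper builds on rather than a formula quoted in this row, and the epoch count is hypothetical.

```python
import torch

# Hyperparameters reported in the setup row.
GAMMA = 4.0                    # margin gamma
LR_START, LR_END = 1e-3, 2e-4  # lambda declines linearly from 0.001 to 0.0002
DIM_S, DIM_I = 50, 4096        # structure / image embedding dimensions
EPOCHS = 1000                  # hypothetical; the paper does not report it

def margin_loss(pos_score, neg_score, gamma=GAMMA):
    """Hinge loss: keep positive triple scores at least gamma below negatives."""
    return torch.clamp(gamma + pos_score - neg_score, min=0.0).mean()

def linear_lr(epoch, total=EPOCHS, start=LR_START, end=LR_END):
    """Learning rate declining linearly from `start` to `end` over training."""
    return start + (end - start) * epoch / max(total - 1, 1)
```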