Text-Enhanced Representation Learning for Knowledge Graph

Authors: Zhigang Wang, Juanzi Li

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on multiple benchmark datasets show that our proposed method successfully addresses the above issues and significantly outperforms the state-of-the-art methods."
Researcher Affiliation | Academia | "Zhigang Wang and Juanzi Li, Tsinghua University, Beijing, CHINA. wangzg14@mails.tsinghua.edu.cn, lijuanzi@tsinghua.edu.cn"
Pseudocode | No | The paper describes the overall framework and the steps of the proposed method, but it provides no structured pseudocode or algorithm blocks.
Open Source Code | No | The paper contains no statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "For the knowledge graphs to be represented, we employ several datasets commonly used in previous methods, which are generated from WordNet [Miller, 1995] and Freebase [Bollacker et al., 2008]. Following [Bordes et al., 2013; Wang et al., 2014b; Lin et al., 2015b; Socher et al., 2013], we adopt four benchmark datasets for evaluation, which are WN18 and WN11 generated from WordNet, FB15K and FB13 generated from Freebase."
Dataset Splits | Yes | "The detailed statistics of the datasets are shown in Table 1." Per dataset, Table 1 lists relations (#R), entities (#E), and train/valid/test triples: WN18 (18; 40,943; 141,442/5,000/5,000), FB15K (1,345; 14,951; 483,142/50,000/59,071), WN11 (11; 38,696; 112,581/2,609/10,544), FB13 (13; 75,043; 316,232/5,908/23,733). A loading sketch follows the table.
Hardware Specification | No | The paper gives no details on the hardware used to run the experiments (e.g., GPU/CPU models, memory).
Software Dependencies | No | The paper mentions training a word2vec model and optimizing with stochastic gradient descent (SGD), but it specifies no software dependencies or version numbers. A hedged word2vec sketch follows the table.
Experiment Setup | Yes | "We set the neighboring threshold on the co-occurrence network to be 10, and select learning rate λ for SGD among {0.1, 0.01, 0.001}, the margin γ among {1, 2, 4}, the embedding dimension k among {20, 50, 100}, the batch size B among {120, 1440, 4800}. The best configuration is determined according to the mean rank in validation set. We traverse all the training triples for 1,000 times." A grid-search sketch follows the table.
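
The split statistics quoted above are easy to sanity-check once the benchmark files are downloaded. Below is a minimal Python sketch, assuming the common public release layout of WN18/FB15K (one tab-separated head/relation/tail triple per line in train.txt, valid.txt, and test.txt); the file names and format are assumptions, since the paper does not describe them.

```python
# Minimal sketch for verifying the Table 1 statistics of a benchmark
# split. The directory layout and the tab-separated head/relation/tail
# file format are assumptions based on the common public releases of
# these datasets, not details stated in the paper.
from pathlib import Path

def load_triples(path):
    """Read one (head, relation, tail) triple per tab-separated line."""
    with open(path, encoding="utf-8") as f:
        return [tuple(line.rstrip("\n").split("\t")) for line in f]

def split_statistics(dataset_dir):
    """Return (#relations, #entities, {split: #triples}) for a dataset."""
    splits = {name: load_triples(Path(dataset_dir) / f"{name}.txt")
              for name in ("train", "valid", "test")}
    entities, relations = set(), set()
    for triples in splits.values():
        for head, relation, tail in triples:
            entities.update((head, tail))
            relations.add(relation)
    return len(relations), len(entities), {n: len(t) for n, t in splits.items()}

# Expected for WN18 per Table 1: 18 relations, 40,943 entities,
# and 141,442 / 5,000 / 5,000 train/valid/test triples.
print(split_statistics("data/WN18"))
```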
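
Since no toolkit or version numbers are named, reproducing the text side requires a guess. Here is a minimal sketch of the word2vec step using gensim 4.x; the corpus file name and every hyperparameter shown are assumptions, as the paper reports none of them.

```python
# Hypothetical word2vec training step. The paper only says a word2vec
# model is trained on the text corpus; gensim, the corpus file name,
# and all hyperparameters below are assumptions.
from gensim.models import Word2Vec

# One whitespace-tokenized sentence per line (hypothetical corpus file).
with open("annotated_corpus.txt", encoding="utf-8") as f:
    sentences = [line.split() for line in f]

model = Word2Vec(
    sentences,
    vector_size=100,  # matches the largest embedding dimension tried (k = 100)
    window=5,         # context window; not reported in the paper
    min_count=5,      # frequency cutoff; not reported in the paper
    sg=1,             # skip-gram; the paper does not say which variant was used
    workers=4,
)
model.save("word2vec.model")
```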
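
The Experiment Setup quote fully specifies the search grid and the model-selection criterion, so the outer loop can be written down directly. A sketch follows, with hypothetical train_model and mean_rank callables standing in for the authors' unreleased training and evaluation code.

```python
# Grid search over the hyperparameters quoted above, selecting the
# configuration with the lowest mean rank on the validation set.
# train_model and mean_rank are hypothetical callables standing in for
# the authors' unreleased training and evaluation code.
from itertools import product

GRID = {
    "learning_rate": (0.1, 0.01, 0.001),  # SGD learning rate (lambda)
    "margin": (1, 2, 4),                  # ranking-loss margin (gamma)
    "dimension": (20, 50, 100),           # embedding dimension (k)
    "batch_size": (120, 1440, 4800),      # mini-batch size (B)
}

def select_best_config(train_model, mean_rank, train_triples, valid_triples):
    """Try every grid point; keep the config with the lowest mean rank."""
    best_config, best_rank = None, float("inf")
    for values in product(*GRID.values()):
        config = dict(zip(GRID, values))
        # "We traverse all the training triples for 1,000 times."
        model = train_model(train_triples, epochs=1000, **config)
        rank = mean_rank(model, valid_triples)  # lower is better
        if rank < best_rank:
            best_config, best_rank = config, rank
    return best_config, best_rank
```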