Embedding Entities and Relations for Learning and Inference in Knowledge Bases

Authors: Bishan Yang, Wen-tau Yih, Xiaodong He, Jianfeng Gao, and Li Deng

ICLR 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "Under this framework, we compare a variety of embedding models on the link prediction task. We show that a simple bilinear formulation achieves new state-of-the-art results for the task (achieving a top-10 accuracy of 73.2% vs. 54.7% by TransE on Freebase)." (The scoring functions being compared are sketched after this table.) |
| Researcher Affiliation | Collaboration | "1 Department of Computer Science, Cornell University, Ithaca, NY 14850, USA (bishan@cs.cornell.edu); 2 Microsoft Research, Redmond, WA 98052, USA ({scottyih,xiaohe,jfgao,deng}@microsoft.com)" |
| Pseudocode | Yes | "Algorithm 1 EMBEDRULE" (see the sketch after this table) |
| Open Source Code | No | The paper makes no explicit statement about releasing its source code and provides no link to a code repository for the described methodology. |
| Open Datasets | Yes | "Datasets: We used the WordNet (WN) and Freebase (FB15k) datasets introduced in (Bordes et al., 2013b)." |
| Dataset Splits | Yes | "We use the same training/validation/test split as in (Bordes et al., 2013b)." |
| Hardware Specification | No | "All the models were implemented in C# and using GPU. All methods are evaluated on a machine with a 64-bit processor, 2 CPUs and 8GB memory." This names general hardware types but omits specifics such as the GPU model or the exact CPU type. |
| Software Dependencies | No | The paper states "All the models were implemented in C#" and refers to AdaGrad (Duchi et al., 2011) for training, but gives no version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | "For all models, we set the number of mini-batches to 10, the dimensionality of the entity vector d = 100, the regularization parameter λ = 0.0001, and the number of training epochs T = 100 on FB15k and FB15k-401 and T = 300 on WN (T was determined based on the learning curves where the performance of all models plateaued). The learning rate was initially set to 0.1 and then adapted during training by AdaGrad." (A configuration sketch follows the table.) |