Embedding Entities and Relations for Learning and Inference in Knowledge Bases
Authors: Bishan Yang, Wen-tau (Scott) Yih, Xiaodong He, Jianfeng Gao, and Li Deng
ICLR 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Under this framework, we compare a variety of embedding models on the link prediction task. We show that a simple bilinear formulation achieves new state-of-the-art results for the task (achieving a top-10 accuracy of 73.2% vs. 54.7% by TransE on Freebase). |
| Researcher Affiliation | Collaboration | 1Department of Computer Science, Cornell University, Ithaca, NY, 14850, USA bishan@cs.cornell.edu 2Microsoft Research, Redmond, WA 98052, USA {scottyih,xiaohe,jfgao,deng}@microsoft.com |
| Pseudocode | Yes | Algorithm 1 EMBEDRULE |
| Open Source Code | No | The paper does not provide any explicit statement about making its source code available or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We used the WordNet (WN) and Freebase (FB15k) datasets introduced in (Bordes et al., 2013b). |
| Dataset Splits | Yes | We use the same training/validation/test split as in (Bordes et al., 2013b). |
| Hardware Specification | No | All the models were implemented in C# and using GPU. All methods are evaluated on a machine with a 64-bit processor, 2 CPUs and 8GB memory. The paper names general hardware types but gives no specific models or detailed specifications (e.g., GPU model or exact CPU type). |
| Software Dependencies | No | The paper states 'All the models were implemented in C#' and refers to 'AdaGrad (Duchi et al., 2011)' for training, but does not provide specific version numbers for any software dependencies or libraries used. |
| Experiment Setup | Yes | For all models, we set the number of mini-batches to 10, the dimensionality of the entity vector d = 100, the regularization parameter to 0.0001, and the number of training epochs to T = 100 on FB15k and FB15k-401 and T = 300 on WN (T was determined based on the learning curves, where the performance of all models plateaued). The learning rate was initially set to 0.1 and then adapted during training by AdaGrad. |
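The bilinear formulation and training setup quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' C# implementation: the hyperparameters (d = 100, regularization 0.0001, 10 mini-batches, initial learning rate 0.1) are taken from the Experiment Setup row, while the function names and the diagonal-bilinear (DistMult-style) scoring form are assumptions for illustration.

```python
import numpy as np

# Hyperparameters as reported in the paper's Experiment Setup.
DIM = 100          # entity vector dimensionality d
REG = 1e-4         # regularization parameter
N_BATCHES = 10     # number of mini-batches
INIT_LR = 0.1      # initial learning rate, adapted during training by AdaGrad
EPOCHS_FB = 100    # T on FB15k / FB15k-401
EPOCHS_WN = 300    # T on WN

def bilinear_score(e_head, w_rel, e_tail):
    """Bilinear score with a diagonal relation matrix:
    score(h, r, t) = e_h^T diag(w_r) e_t = sum_i e_h[i] * w_r[i] * e_t[i]."""
    return float(np.sum(e_head * w_rel * e_tail))

def adagrad_step(param, grad, hist, lr=INIT_LR, eps=1e-8):
    """One AdaGrad update: each coordinate's step size shrinks with the
    accumulated squared gradient for that coordinate."""
    hist = hist + grad ** 2
    param = param - lr * grad / (np.sqrt(hist) + eps)
    return param, hist
```

For example, with all-ones head, relation, and tail vectors of dimension 100, `bilinear_score` returns 100.0, since every per-coordinate product contributes 1.
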