Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Embedding of Hierarchically Typed Knowledge Bases
Authors: Richong Zhang, Fanshuang Kong, Chenyue Wang, Yongyi Mao
AAAI 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments. We conduct experiments to evaluate the proposed typed models (TransE-T and TransH-T on binary data and mTransH-T on multi-fold data) and their corresponding typeless models. |
| Researcher Affiliation | Academia | Richong Zhang,¹ Fanshuang Kong,¹ Chenyue Wang,¹ Yongyi Mao² ¹BDBC and SKLSDE, School of Computer Science and Engineering, Beihang University; ²School of Electrical Engineering and Computer Science, University of Ottawa |
| Pseudocode | No | The paper describes the proposed scheme and models using mathematical equations and textual explanations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | More implementation details of the model can be found in the code on GITHUB (footnote 2). Footnote 2: https://github.com/kongfansh/Embedding of Hierarchically Typed KB |
| Open Datasets | Yes | Three datasets FB15K, FB15K*, and JF17K are used in the experiments. Both FB15K (Bordes et al. 2013) and JF17K (Wen et al. 2016) contain filtered data obtained from Freebase. |
| Dataset Splits | No | Table 1 lists "#instances(total/train/test)" for each dataset, indicating the data is split into training and testing sets, but a separate validation set split is not explicitly mentioned with quantities or percentages within the paper. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU or GPU models, memory, or cloud infrastructure specifications. |
| Software Dependencies | No | The paper mentions "SGD" (stochastic gradient descent) and refers to code on "GITHUB" but does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | In all the models, SGD initializes all entity embedding vectors and all $b_r$ vectors to random unit-length vectors. For all typed models, each $a_c$ vector is also initialized to random unit-length vectors, and each $d_c$ is initialized to 1. For mTransH, each $a_r(\rho)$ is randomly initialized to a value in the interval (0, 1). In SGD, each mini-batch consists of 1000 packets. Each epoch loops over M/1000 batches, where M is the number of instances in the training set. For each model, SGD runs for 1000 epochs. The learning rate in the updates based on the base-cost gradients is set to 0.001, and the learning rate in the updates based on the type-cost gradient is set to 0.0003. |
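
The initialization and schedule quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the dictionary keys, the embedding dimension, and the entity, relation, and type counts are all hypothetical, and the quoted text does not say how the random unit-length vectors are drawn (normalized Gaussian samples are assumed here) or how M/1000 is rounded (floor division is assumed).

```python
import math
import random

# Hyperparameters quoted in the Experiment Setup row.
BATCH_SIZE = 1000   # "each mini-batch consists of 1000 packets"
NUM_EPOCHS = 1000   # "SGD runs for 1000 epochs"
LR_BASE = 0.001     # learning rate for updates from base-cost gradients
LR_TYPE = 0.0003    # learning rate for updates from the type-cost gradient


def random_unit_vector(dim, rng):
    """Return a random vector of Euclidean length 1.

    Assumption: a Gaussian sample is normalized; the source only says
    "random unit-length vectors".
    """
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        if norm > 0.0:
            return [x / norm for x in v]


def random_open_unit(rng):
    """Return a value in the open interval (0, 1), as stated for a_r(rho)."""
    x = rng.random()          # random() samples from [0, 1)
    while x == 0.0:           # reject the closed endpoint
        x = rng.random()
    return x


def init_parameters(num_entities, num_relations, num_types, dim, seed=0):
    """Initialize parameters as described; all sizes are hypothetical inputs."""
    rng = random.Random(seed)
    return {
        "entity": [random_unit_vector(dim, rng) for _ in range(num_entities)],
        "b_r": [random_unit_vector(dim, rng) for _ in range(num_relations)],
        "a_c": [random_unit_vector(dim, rng) for _ in range(num_types)],
        "d_c": [1.0] * num_types,  # "each d_c is initialized to 1"
    }


def batches_per_epoch(num_train_instances):
    """Number of mini-batches per epoch, M / 1000 (floor is an assumption)."""
    return num_train_instances // BATCH_SIZE
```

With these helpers, a training loop would run `NUM_EPOCHS` passes of `batches_per_epoch(M)` mini-batches each, applying `LR_BASE` to the base-cost updates and `LR_TYPE` to the type-cost updates. The cost functions themselves are defined in the paper and are not reproduced here.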