Probability Calibration for Knowledge Graph Embedding Models
Authors: Pedro Tabacof, Luca Costabello
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three datasets with ground-truth negatives show our contribution leads to well-calibrated models when compared to the gold standard of using negatives. All calibration methods give significantly better results than the uncalibrated models. |
| Researcher Affiliation | Industry | Pedro Tabacof, Luca Costabello; Accenture Labs, Dublin, Ireland; {pedro.tabacof, luca.costabello}@accenture.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and experiments are available at https://github.com/Accenture/AmpliGraph. |
| Open Datasets | Yes | Datasets. We run experiments on triple classification datasets that include ground truth negatives (Table 1). ... WN11 (Socher et al., 2013). ... FB13 (Socher et al., 2013). ... YAGO39K (Lv et al., 2018). ... We also use two standard link prediction benchmark datasets, WN18RR (Dettmers et al., 2018) ... and FB15K-237 (Toutanova et al., 2015). |
| Dataset Splits | Yes | Datasets. We run experiments on triple classification datasets that include ground truth negatives (Table 1). We train on the training set, calibrate on the validation set, and evaluate on the test set. Table 1(a) lists the triple classification datasets used in experiments and the link prediction datasets used for positive base rate experiments; for WN11 the splits are 112,581 training, 5,218 validation, and 21,088 test triples. (A sketch of this calibration protocol appears after this table.) |
| Hardware Specification | Yes | All experiments were run under Ubuntu 16.04 on an Intel Xeon Gold 6142, 64 GB, equipped with a Tesla V100 16GB. |
| Software Dependencies | Yes | The knowledge graph embedding models are implemented with the AmpliGraph library (Costabello et al., 2019) version 1.1, using TensorFlow 1.13 (Abadi et al., 2016) and Python 3.6 on the backend. |
| Experiment Setup | Yes | We rely on typical hyperparameter values: we train the embeddings with dimensionality k = 100, Adam optimizer, initial learning rate α0 = 1e-4, negatives per positive ratio η = 20, epochs = 1000. We train all models on four different loss functions: Self-adversarial (Sun et al., 2019), pairwise (Bordes et al., 2013), NLL, and Multiclass-NLL (Lacroix et al., 2018). (A training sketch with these settings follows the table.) |
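
The experiment setup row maps naturally onto the AmpliGraph 1.x API cited in the software dependencies row. The block below is a minimal sketch of one such training run, assuming the AmpliGraph 1.1 `TransE` class and the `load_wn18rr` loader; the choice of model and dataset is illustrative only, since the paper trains several models on several datasets and loss functions.

```python
# Minimal sketch of the reported training setup, assuming the AmpliGraph 1.x API
# (TensorFlow 1.13 backend). TransE and WN18RR are illustrative choices, not the
# paper's full experimental grid.
from ampligraph.datasets import load_wn18rr
from ampligraph.latent_features import TransE

data = load_wn18rr()  # dict with 'train', 'valid', 'test' arrays of triples

model = TransE(
    k=100,                           # embedding dimensionality (paper: k = 100)
    eta=20,                          # negatives per positive (paper: eta = 20)
    epochs=1000,                     # paper: 1000 epochs
    optimizer='adam',
    optimizer_params={'lr': 1e-4},   # initial learning rate alpha_0 = 1e-4
    loss='self_adversarial',         # one of the four losses listed above
    verbose=True,
)

model.fit(data['train'])
scores = model.predict(data['test'])  # raw, uncalibrated triple scores
```

Swapping `loss='self_adversarial'` for `'pairwise'`, `'nll'`, or `'multiclass_nll'` covers the four loss functions listed in the setup row.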
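
The dataset-splits row describes the calibration protocol: train on the training split, fit the calibration on the validation split, evaluate on the test split. Below is a hedged sketch of that protocol using Platt scaling via scikit-learn as a stand-in for the paper's own implementation; the variable names, the random placeholder data, and the use of Brier score and log loss as evaluation metrics are assumptions made here for illustration.

```python
# Sketch of Platt-scaling calibration fitted on the validation split and
# evaluated on the test split. scikit-learn is used as a stand-in for the
# paper's implementation; scores are raw model outputs, labels are 0/1
# ground-truth triple labels (as in WN11/FB13-style triple classification).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import brier_score_loss, log_loss

def platt_scale(valid_scores, valid_labels, test_scores):
    """Fit a sigmoid on validation scores, return calibrated test probabilities."""
    lr = LogisticRegression(C=1e10)  # effectively unregularized Platt scaling
    lr.fit(valid_scores.reshape(-1, 1), valid_labels)
    return lr.predict_proba(test_scores.reshape(-1, 1))[:, 1]

def evaluate_calibration(probs, labels):
    """Two standard calibration metrics: Brier score and log loss."""
    return {'brier': brier_score_loss(labels, probs),
            'log_loss': log_loss(labels, probs)}

# Example usage with random placeholder data sized like the WN11 splits
# (replace with real model scores and ground-truth labels):
rng = np.random.default_rng(0)
valid_scores, valid_labels = rng.normal(size=5218), rng.integers(0, 2, 5218)
test_scores, test_labels = rng.normal(size=21088), rng.integers(0, 2, 21088)
probs = platt_scale(valid_scores, valid_labels, test_scores)
print(evaluate_calibration(probs, test_labels))
```

Later AmpliGraph releases (after the 1.1 version used in the paper) added `calibrate()` and `predict_proba()` methods that perform this calibration directly on the embedding models.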