Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Complex Embeddings for Simple Link Prediction

Authors: Théo Trouillon, Johannes Welbl, Sebastian Riedel, Eric Gaussier, Guillaume Bouchard

ICML 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In order to evaluate our proposal, we conducted experiments on both synthetic and real datasets. The synthetic dataset is based on relations that are either symmetric or antisymmetric, whereas the real datasets comprise different types of relations found in different, standard KBs. We refer to our model as ComplEx, for Complex Embeddings.
Researcher Affiliation	Collaboration	1 Xerox Research Centre Europe, 6 chemin de Maupertuis, 38240 Meylan, FRANCE 2 Université Grenoble Alpes, 621 avenue Centrale, 38400 Saint Martin d'Hères, FRANCE 3 University College London, Gower St, London WC1E 6BT, UNITED KINGDOM
Pseudocode	No	The main body of the paper does not contain pseudocode or a clearly labeled algorithm block. While Appendix A is mentioned, it is not included in the provided text.
Open Source Code	No	Code is currently under clearance review and will be available at: https://github.com/ttrouill/complex
Open Datasets	Yes	We next evaluate the performance of our model on the FB15K and WN18 datasets. FB15K is a subset of Freebase, a curated KB of general facts, whereas WN18 is a subset of Wordnet, a database featuring lexical relations between words. We use original training, validation and test set splits as provided by Bordes et al. (2013b).
Dataset Splits	Yes	Table 3. Number of entities, relations, and observed triples in each split for the FB15K and WN18 datasets. Columns: Dataset; |E|; |R|; #triples in Train/Valid/Test. WN18: 40,943; 18; 141,442 / 5,000 / 5,000. FB15K: 14,951; 1,345; 483,142 / 50,000 / 59,071.
Hardware Specification	No	The paper mentions training models but does not provide specific hardware details such as GPU/CPU models or memory specifications.
Software Dependencies	No	The paper mentions using "theano (Bergstra et al., 2010)" but does not specify a version number for this or any other software dependency.
Experiment Setup	Yes	Reported results are given for the best set of hyper-parameters evaluated on the validation set for each model, after grid search on the following values: K ∈ {10, 20, 50, 100, 150, 200}, λ ∈ {0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0}, α0 ∈ {1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01}, η ∈ {1, 2, 5, 10} with λ the L2 regularization parameter, α0 the initial learning rate (then tuned at runtime with AdaGrad), and η the number of negatives generated per positive training triple. We also tried varying the batch size but this had no impact and we settled with 100 batches per epoch.
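
To give a sense of the size of the search described in the Experiment Setup row, the grid can be enumerated as in the minimal sketch below. The value lists are copied from the quoted text. The dictionary keys, the helper names `iter_configs` and `batch_size`, and the assumption that "100 batches per epoch" means the training triples are split into 100 roughly equal batches are illustrative and not taken from the paper or its code.

```python
import math
from itertools import product

# Grid values as quoted in the Experiment Setup row.
# The key names are illustrative, not the authors' identifiers.
GRID = {
    "K": [10, 20, 50, 100, 150, 200],                          # embedding rank
    "lambda": [0.1, 0.03, 0.01, 0.003, 0.001, 0.0003, 0.0],    # L2 regularization
    "alpha0": [1.0, 0.5, 0.2, 0.1, 0.05, 0.02, 0.01],          # initial learning rate
    "eta": [1, 2, 5, 10],                                      # negatives per positive
}


def iter_configs(grid):
    """Yield one dict per point of the Cartesian product of the grid."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def batch_size(n_train, batches_per_epoch=100):
    """Assumed reading of '100 batches per epoch': equal-sized batches."""
    return math.ceil(n_train / batches_per_epoch)


configs = list(iter_configs(GRID))
n_configs = len(configs)  # 6 * 7 * 7 * 4 combinations per model and dataset

fb15k_batch = batch_size(483_142)  # FB15K training triples from Table 3
wn18_batch = batch_size(141_442)   # WN18 training triples from Table 3
```

Under these assumptions the grid holds 1,176 configurations per model and dataset, and a batch contains about 4,832 triples on FB15K and 1,415 on WN18. Whether the authors ran every combination exhaustively is not stated in the quoted text.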