Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Visual Pivoting for (Unsupervised) Entity Alignment
Authors: Fangyu Liu, Muhao Chen, Dan Roth, Nigel Collier | Pages: 4257-4266
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on benchmark data sets DBP15k and DWY15k show that EVA offers state-of-the-art performance on both monolingual and cross-lingual entity alignment tasks. Furthermore, we discover that images are particularly useful to align long-tail KG entities, which inherently lack the structural contexts that are necessary for capturing the correspondences. Code release: https://github.com/cambridgeltl/eva; project page: http://cogcomp.org/page/publication_view/927. In this section, we conduct experiments on two benchmark data sets (§4.1), under both semi- and unsupervised settings (§4.2). We also provide detailed ablation studies on different model components (§4.3), and study the impact of incorporating visual representations on long-tail entities (§4.4). |
| Researcher Affiliation | Academia | 1 Language Technology Lab, TAL, University of Cambridge, UK; 2 Department of Computer and Information Science, University of Pennsylvania, USA; 3 Viterbi School of Engineering, University of Southern California, USA |
| Pseudocode | Yes | Algorithm 1: Visual pivot induction. |
| Open Source Code | Yes | Code release: https://github.com/cambridgeltl/eva; project page: http://cogcomp.org/page/publication_view/927. |
| Open Datasets | Yes | The experiments are conducted on DBP15k (Sun, Hu, and Li 2017) and DWY15k (Guo, Sun, and Hu 2019). |
| Dataset Splits | No | The paper states 'using 30% of the EA labels for training' and discusses test sets, but does not explicitly provide percentages or counts for a separate validation split or refer to a standard validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | No | The paper mentions software components like 'AdamW', 'RESNET-152', and 'FASTTEXT' but does not provide specific version numbers for them within the main text. |
| Experiment Setup | Yes | The GCN has two layers with input, hidden and output dimensions of 400, 400, 200 respectively. Attribute and relation features are mapped to 100-d. Images are transformed to 2048-d features by RESNET and then mapped to 200-d. For model variants without IL, training is limited to 500 epochs. Otherwise, after the first 500 epochs, IL is conducted for another 500 epochs with the configurations Ke = 5, Ks = 10 as described in §3.2. We train all models using a batch size of 7,500. The models are optimised using AdamW (Loshchilov and Hutter 2019) with a learning rate of 5e-4 and a weight decay of 1e-2. |
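
The experiment setup quoted in the last row can be collected into a single configuration object, which may be a convenient starting point for a reproduction attempt. The following is a minimal sketch, not the authors' code: the class name `EVASetup`, its field names, and the `total_epochs` helper are hypothetical, and only the numeric values are taken from the quoted setup.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EVASetup:
    """Hyperparameters quoted in the Experiment Setup row.

    All names here are illustrative assumptions; only the values come
    from the quoted text.
    """

    # Two-layer GCN: input, hidden and output dimensions.
    gcn_dims: Tuple[int, int, int] = (400, 400, 200)
    # Attribute and relation features are each mapped to 100-d.
    attribute_dim: int = 100
    relation_dim: int = 100
    # RESNET image features (2048-d) are mapped to 200-d.
    image_feature_dim: int = 2048
    image_projection_dim: int = 200
    # Training schedule: 500 epochs, plus 500 more when IL is enabled.
    base_epochs: int = 500
    il_epochs: int = 500
    use_il: bool = True
    # IL configuration values (Ke and Ks in the quoted text).
    il_ke: int = 5
    il_ks: int = 10
    # Optimisation settings (AdamW in the quoted text).
    batch_size: int = 7500
    learning_rate: float = 5e-4
    weight_decay: float = 1e-2

    def total_epochs(self) -> int:
        """Return the epoch budget for this variant."""
        extra = self.il_epochs if self.use_il else 0
        return self.base_epochs + extra


if __name__ == "__main__":
    with_il = EVASetup()
    without_il = EVASetup(use_il=False)
    print("epochs with IL:", with_il.total_epochs())
    print("epochs without IL:", without_il.total_epochs())
```

Freezing the dataclass keeps a given run's settings immutable, so the with-IL and without-IL variants are expressed as separate instances rather than by mutating shared state.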