COTSAE: CO-Training of Structure and Attribute Embeddings for Entity Alignment

Authors: Kai Yang, Shaoqin Liu, Junfeng Zhao, Yasha Wang, Bing Xie

AAAI 2020, pp. 3025-3032

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verified our COTSAE on several datasets from real-world KGs, and the results showed that it is significantly better than the latest entity alignment methods." From the Experiments section: "In this section, we report our experiments and results compared with several state-of-the-art methods on a set of real-world KG datasets."
Researcher Affiliation | Academia | 1 Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871, China; 2 School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; 3 National Engineering Research Center for Software Engineering, Peking University, 100871; 4 Peking University Information Technology Institute (Tianjin Binhai), Tianjin 300450, China
Pseudocode | No | The paper describes the proposed model and its components but does not include structured pseudocode or algorithm blocks (see the co-training sketch below the table).
Open Source Code | Yes | "We used Tensorflow to develop our approach COTSAE." Footnote 1: https://github.com/ykpku/COTSA
Open Datasets | Yes | "To verify the performance of our model on the KGs with different data scales and entity sampling distributions, we use two datasets: DWY15K and DWY100K. The entities and structural triples are from (Guo, Sun, and Hu 2019), which guarantee the degree distributions of the sampled entities following the original KGs. We then extract all the attribute triples that involve the entities in the alignments from the original KGs (DBpedia, Wikidata, and YAGO3). ... The entities and structural triples are from (Sun et al. 2018)." (See the filtering sketch below the table.)
Dataset Splits | Yes | "Each dataset provides 30% entities as seed alignments by default and leaves the remaining for evaluating entity alignment performance." (See the split sketch below the table.)
Hardware Specification | Yes | "Our experiments were conducted on a server with an Intel Xeon E5-2620 2.1 GHz CPU, 4 NVIDIA GeForce Titan Xp GPUs, and 128 GB memory."
Software Dependencies | No | The paper states "We used Tensorflow to develop our approach COTSAE" but does not provide version numbers for TensorFlow or any other software libraries, which are necessary for reproducibility.
Experiment Setup | Yes | "The hyper-parameters of COTSAE were used as below. For the TransE component model, we followed (Sun et al. 2018) and set γ1 = 0.01, γ2 = 2.0, and μ1 = 0.2, and 10 negative relational triples were sampled for each positive triple. The dimensions of entity and relation embeddings were set to 75. The learning rate was set to 0.01 and the batch size to 2000. In the Pseudo-Siamese Neural Network, we chose the 51 most frequently used characters, and the character embedding size was set to 32. The embedding size of attribute type was set to 64. The learning rate of the Pseudo-Siamese Neural Network was set to 0.001 and the batch size to 64. We set the weight size of joint attention to 32 and the hidden-layer size of the Bi-GRUs to 64." (See the encoder sketch below the table.)
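
Since the paper gives no algorithm block, the following is a minimal, hedged sketch of what a co-training loop over the two views could look like, assuming a TransE-style structure model and a Pseudo-Siamese attribute model that alternately contribute new alignment pairs to a shared set. The callables passed in (train_structure, train_attribute, propose_structure, propose_attribute) are hypothetical placeholders, not the authors' API.

```python
def co_train(seed_alignments, train_structure, train_attribute,
             propose_structure, propose_attribute, rounds=10):
    """Hedged sketch of alternating co-training over two embedding views."""
    alignments = set(seed_alignments)
    for _ in range(rounds):
        # Update each view's embeddings on the current alignment set.
        train_structure(alignments)   # TransE-style structure embeddings
        train_attribute(alignments)   # Pseudo-Siamese attribute embeddings
        # Each view contributes its confident new pairs to the shared set.
        alignments |= set(propose_structure())
        alignments |= set(propose_attribute())
    return alignments
```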
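The dataset construction quoted in the Open Datasets row filters the original KGs' attribute triples down to entities that appear in the alignments. A minimal sketch of that filtering step, with hypothetical variable names, might look like this:

```python
def extract_attribute_triples(attribute_triples, aligned_entities):
    """Keep only attribute triples whose subject entity appears in the alignments."""
    keep = set(aligned_entities)
    return [(e, a, v) for (e, a, v) in attribute_triples if e in keep]
```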
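The 30%/70% split of reference alignments into training seeds and evaluation pairs can be reproduced with a simple shuffle-and-cut; this sketch assumes the alignments are given as a list of entity-ID pairs.

```python
import random

def split_alignments(reference_pairs, seed_ratio=0.3, rng_seed=0):
    """Return (seed_alignments, eval_alignments) with a 30%/70% split by default."""
    pairs = list(reference_pairs)
    random.Random(rng_seed).shuffle(pairs)
    cut = int(len(pairs) * seed_ratio)
    return pairs[:cut], pairs[cut:]
```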
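To make the reported sizes concrete, here is a minimal TensorFlow 2 / Keras sketch of a character-level attribute encoder using the quoted hyper-parameters (51 characters, character embedding 32, attribute-type embedding 64, Bi-GRU hidden size 64). The layer wiring, maximum value length, and attribute-type count are assumptions for illustration, not the authors' implementation, and the joint attention mechanism is omitted.

```python
import tensorflow as tf

# Reported sizes from the paper; everything else is an illustrative assumption.
CHAR_VOCAB = 51 + 1        # 51 most frequent characters plus padding index 0
CHAR_EMB = 32
TYPE_EMB = 64
GRU_HIDDEN = 64
MAX_LEN = 64               # assumed maximum character length of a value

def build_attribute_encoder(num_attribute_types):
    chars = tf.keras.Input(shape=(MAX_LEN,), dtype="int32", name="value_chars")
    attr_type = tf.keras.Input(shape=(), dtype="int32", name="attribute_type")
    # Character-level Bi-GRU over the attribute value string.
    c = tf.keras.layers.Embedding(CHAR_VOCAB, CHAR_EMB, mask_zero=True)(chars)
    v = tf.keras.layers.Bidirectional(tf.keras.layers.GRU(GRU_HIDDEN))(c)
    # Attribute-type embedding of size 64, concatenated with the value encoding.
    t = tf.keras.layers.Embedding(num_attribute_types, TYPE_EMB)(attr_type)
    out = tf.keras.layers.Concatenate()([v, t])
    return tf.keras.Model([chars, attr_type], out)

encoder = build_attribute_encoder(num_attribute_types=1000)  # hypothetical count
```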