COTSAE: CO-Training of Structure and Attribute Embeddings for Entity Alignment

Authors: Kai Yang, Shaoqin Liu, Junfeng Zhao, Yasha Wang, Bing Xie

AAAI 2020, pp. 3025-3032

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verified our COTSAE on several datasets from real-world KGs, and the results showed that it is significantly better than the latest entity alignment methods." From the Experiments section: "In this section, we report our experiments and results compared with several state-of-the-art methods on a set of real-world KG datasets."
Researcher Affiliation | Academia | 1 Key Laboratory of High Confidence Software Technologies, Ministry of Education, Beijing 100871, China; 2 School of Electronics Engineering and Computer Science, Peking University, Beijing 100871, China; 3 National Engineering Research Center for Software Engineering, Peking University, 100871; 4 Peking University Information Technology Institute (Tianjin Binhai), Tianjin 300450, China
Pseudocode | No | The paper describes the proposed model and its components but does not include structured pseudocode or algorithm blocks (see the co-training sketch below the table).
Open Source Code | Yes | "We used Tensorflow to develop our approach COTSAE." Footnote 1: https://github.com/ykpku/COTSA
Open Datasets | Yes | "To verify the performance of our model on the KGs with different data scales and entity sampling distributions, we use two datasets: DWY15K and DWY100K. The entities and structural triples are from (Guo, Sun, and Hu 2019), which guarantee the degree distributions of the sampled entities following the original KGs. We then extract all the attribute triples that involve the entities in the alignments from the original KGs (DBpedia, Wikidata, and YAGO3). ... The entities and structural triples are from (Sun et al. 2018)." (See the filtering sketch below the table.)
Dataset Splits | Yes | "Each dataset provides 30% entities as seed alignments by default and leaves the remaining for evaluating entity alignment performance." (See the split sketch below the table.)
Hardware Specification | Yes | "Our experiments were conducted on a server with an Intel Xeon E5-2620 2.1 GHz CPU, 4 NVIDIA GeForce Titan Xp GPUs, and 128 GB memory."
Software Dependencies | No | The paper states "We used Tensorflow to develop our approach COTSAE" but does not provide version numbers for TensorFlow or any other software libraries, which are necessary for reproducibility.
Experiment Setup | Yes | "The hyper-parameters of COTSAE were used as below. For the TransE component model, we followed (Sun et al. 2018) and set γ1 = 0.01, γ2 = 2.0, and μ1 = 0.2, and 10 negative relational triples were sampled for each positive triple. The dimensions of entity and relation embeddings were set to 75. The learning rate was set to 0.01 and the batch size to 2000. In the Pseudo-Siamese Neural Network, we chose the 51 most frequently used characters, and the character embedding size was set to 32. The embedding size of attribute type was set to 64. The learning rate of the Pseudo-Siamese Neural Network was set to 0.001 and the batch size to 64. We set the weight size of joint attention to 32 and the hidden-layer size of the Bi-GRUs to 64." (See the encoder sketch below the table.)
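
Since the paper gives no algorithm block, the following is a minimal, hedged sketch of what a co-training loop over the two views could look like, assuming a TransE-style structure model and a Pseudo-Siamese attribute model that alternately contribute new alignment pairs to a shared set. The callables passed in (train_structure, train_attribute, propose_structure, propose_attribute) are hypothetical placeholders, not the authors' API.

```python
def co_train(seed_alignments, train_structure, train_attribute,
             propose_structure, propose_attribute, rounds=10):
    """Hedged sketch of alternating co-training over two embedding views."""
    alignments = set(seed_alignments)
    for _ in range(rounds):
        # Update each view's embeddings on the current alignment set.
        train_structure(alignments)   # TransE-style structure embeddings
        train_attribute(alignments)   # Pseudo-Siamese attribute embeddings
        # Each view contributes its confident new pairs to the shared set.
        alignments |= set(propose_structure())
        alignments |= set(propose_attribute())
    return alignments
```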
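The dataset construction quoted in the Open Datasets row filters the original KGs' attribute triples down to entities that appear in the alignments. A minimal sketch of that filtering step, with hypothetical variable names, might look like this:

```python
def extract_attribute_triples(attribute_triples, aligned_entities):
    """Keep only attribute triples whose subject entity appears in the alignments."""
    keep = set(aligned_entities)
    return [(e, a, v) for (e, a, v) in attribute_triples if e in keep]
```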
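The 30%/70% split of reference alignments into training seeds and evaluation pairs can be reproduced with a simple shuffle-and-cut; this sketch assumes the alignments are given as a list of entity-ID pairs.

```python
import random

def split_alignments(reference_pairs, seed_ratio=0.3, rng_seed=0):
    """Return (seed_alignments, eval_alignments) with a 30%/70% split by default."""
    pairs = list(reference_pairs)
    random.Random(rng_seed).shuffle(pairs)
    cut = int(len(pairs) * seed_ratio)
    return pairs[:cut], pairs[cut:]
```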
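To make the reported sizes concrete, here is a minimal TensorFlow 2 / Keras sketch of a character-level attribute encoder using the quoted hyper-parameters (51 characters, character embedding 32, attribute-type embedding 64, Bi-GRU hidden size 64). The layer wiring, maximum value length, and attribute-type count are assumptions for illustration, not the authors' implementation, and the joint attention mechanism is omitted.

```python
import tensorflow as tf

# Reported sizes from the paper; everything else is an illustrative assumption.
CHAR_VOCAB = 51 + 1        # 51 most frequent characters plus padding index 0
CHAR_EMB = 32
TYPE_EMB = 64
GRU_HIDDEN = 64
MAX_LEN = 64               # assumed maximum character length of a value

def build_attribute_encoder(num_attribute_types):
    chars = tf.keras.Input(shape=(MAX_LEN,), dtype="int32", name="value_chars")
    attr_type = tf.keras.Input(shape=(), dtype="int32", name="attribute_type")
    # Character-level Bi-GRU over the attribute value string.
    c = tf.keras.layers.Embedding(CHAR_VOCAB, CHAR_EMB, mask_zero=True)(chars)
    v = tf.keras.layers.Bidirectional(tf.keras.layers.GRU(GRU_HIDDEN))(c)
    # Attribute-type embedding of size 64, concatenated with the value encoding.
    t = tf.keras.layers.Embedding(num_attribute_types, TYPE_EMB)(attr_type)
    out = tf.keras.layers.Concatenate()([v, t])
    return tf.keras.Model([chars, attr_type], out)

encoder = build_attribute_encoder(num_attribute_types=1000)  # hypothetical count
```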