GraphER: Token-Centric Entity Resolution with Graph Convolutional Neural Networks

Authors: Bing Li, Wei Wang, Yifang Sun, Linhan Zhang, Muhammad Asif Ali, Yi Wang

AAAI 2020, pp. 8172-8179

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on two real-world datasets demonstrate that our model stably outperforms state-of-the-art models.
Researcher Affiliation | Academia | (1) School of Computer Science and Engineering, University of New South Wales, Australia ({bing.li, weiw, yifang.sun, muhammadasif.ali}@unsw.edu.au); (2) Dongguan University of Technology, China (wangyi@dgut.edu.cn)
Pseudocode | No | The paper describes the model architecture and steps but does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We used two datasets, each contains two tables, and a list of golden matches. ... Amazon-Google (Köpcke, Thor, and Rahm 2010). ... BeerAdvo-RateBeer (Mudgal et al. 2018).
Dataset Splits | Yes | For both datasets, we use the same 3:1:1 train/dev/test split as in (Mudgal et al. 2018).
Hardware Specification | Yes | The Titan V used for this research was donated by the NVIDIA Corporation.
Software Dependencies | No | Token embeddings were initialized using 300-dimensional pretrained GloVe vectors. While GloVe is a software dependency, no specific version number is provided.
Experiment Setup | Yes | For ER-GCN, the size of Θ(1) was set to |V| × 300, where |V| was the number of nodes in the corresponding ER-Graph, and the size of Θ(2) was 300 × 200. The textual window size was set to 20. Token embeddings were initialized using 300-dimensional pretrained GloVe vectors, while unknown words were initialized with an embedding drawn from a uniform distribution U(-0.25, 0.25). All weight matrices in ER-GCN are initialized using Xavier initialization (Glorot and Bengio 2010) with gain 1. d_a in Eq. 10 was set to 350. For the CNN used in the aggregation layer, we took three filter widths [1, 2, 3], each filter width having 150 kernels. For the final prediction layer, the number of hidden units of the Highway Net is set to 4000. For optimization, we used Adam (Kingma and Ba 2015) with an initial learning rate of 0.001, a dropout rate of 0.5, and gradient clipping to 5; the batch size was 32 and 3 for the Amazon-Google and BeerAdvo-RateBeer datasets, respectively; all other hyper-parameters were kept at their default values. We trained the model for a maximum of 100 epochs, and stopped training if the validation loss did not decrease for 10 consecutive epochs.
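
The initialization details above are concrete enough to sketch in code. Below is a minimal PyTorch sketch of the reported scheme: 300-dimensional GloVe vectors for known tokens, draws from U(-0.25, 0.25) for unknown tokens, and Xavier initialization with gain 1 for weight matrices. Since the paper releases no code, the function names and tensor layout here are assumptions, not GraphER's actual implementation.

```python
import torch
import torch.nn as nn

def make_token_embeddings(glove_vectors: torch.Tensor,
                          known_mask: torch.Tensor) -> nn.Embedding:
    """Build the token embedding table described in the paper.

    glove_vectors: (vocab_size, 300) pretrained GloVe vectors.
    known_mask:    (vocab_size,) boolean mask of tokens found in GloVe.
    Both argument names are hypothetical; the paper specifies no API.
    """
    vocab_size, dim = glove_vectors.shape            # dim == 300 in the paper
    # Unknown words: embeddings drawn from U(-0.25, 0.25), as reported.
    weight = torch.empty(vocab_size, dim).uniform_(-0.25, 0.25)
    weight[known_mask] = glove_vectors[known_mask]   # keep pretrained vectors
    return nn.Embedding.from_pretrained(weight, freeze=False)

def init_weights(module: nn.Module) -> None:
    """Xavier initialization (Glorot and Bengio 2010) with gain 1."""
    if isinstance(module, (nn.Linear, nn.Conv1d)):
        nn.init.xavier_uniform_(module.weight, gain=1.0)
```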
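The aggregation layer is likewise specified precisely enough for a sketch: three convolutional filter widths [1, 2, 3] with 150 kernels each, which is read here as 1-D convolutions max-pooled over time and concatenated. The 200-dimensional input width follows the reported size of Θ(2); the class name and the pooling choice are assumptions, since the paper gives hyper-parameters but no code.

```python
class AggregationCNN(nn.Module):
    """Sketch of the aggregation-layer CNN: widths [1, 2, 3], 150 kernels each."""

    def __init__(self, in_dim: int = 200, n_kernels: int = 150):
        super().__init__()
        # One Conv1d per filter width; in_dim = 200 matches the Θ(2) output size.
        self.convs = nn.ModuleList(
            nn.Conv1d(in_dim, n_kernels, kernel_size=w) for w in (1, 2, 3)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, in_dim); Conv1d expects (batch, channels, seq_len).
        x = x.transpose(1, 2)
        # Max-pool each feature map over time; concatenation gives 3 * 150 = 450 dims.
        return torch.cat([conv(x).max(dim=2).values for conv in self.convs], dim=1)
```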
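Finally, the optimization settings translate into a straightforward training loop: Adam with an initial learning rate of 0.001, gradient clipping to 5, at most 100 epochs, and early stopping once the validation loss fails to improve for 10 consecutive epochs. In this sketch, `evaluate` is a hypothetical helper and the binary loss is an assumption; dropout (rate 0.5) is assumed to live inside the model, and the batch sizes of 32 and 3 come from the two datasets as reported.

```python
def train(model: nn.Module, train_loader, dev_loader,
          max_epochs: int = 100, patience: int = 10) -> None:
    model.apply(init_weights)                                  # Xavier, gain 1
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # initial LR 0.001
    criterion = nn.BCEWithLogitsLoss()   # assumption: binary match/non-match loss
    best_dev_loss, stale_epochs = float("inf"), 0
    for epoch in range(max_epochs):                            # at most 100 epochs
        model.train()
        # Batch size: 32 for Amazon-Google, 3 for BeerAdvo-RateBeer.
        for inputs, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            # Clip gradients to 5, as reported.
            nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
            optimizer.step()
        dev_loss = evaluate(model, dev_loader, criterion)      # hypothetical helper
        if dev_loss < best_dev_loss:
            best_dev_loss, stale_epochs = dev_loss, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:   # no improvement for 10 epochs: stop
                break
```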