Mining Entity Synonyms with Efficient Neural Set Generation

Authors: Jiaming Shen, Ruiliang Lyu, Xiang Ren, Michelle Vanni, Brian Sadler, Jiawei Han

AAAI 2019

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on three real datasets from different domains demonstrate both effectiveness and efficiency of SynSetMine for mining entity synonym sets.
Researcher Affiliation Collaboration Department of Computer Science, University of Illinois Urbana-Champaign, IL, USA Department of Electronic Engineering, Shanghai Jiao Tong University, China U.S. Army Research Laboratory, MD, USA Department of Computer Science, University of Southern California, CA, USA
Pseudocode Yes Algorithm 1: Set Generation Algorithm
Open Source Code Yes Our model implementation is available at: https://github.com/mickeystroller/SynSetMine-pytorch.
Open Datasets Yes We evaluate SynSetMine on three public benchmark datasets used in (Qu, Ren, and Han 2017): 1. Wiki contains 100K articles in the Wikipedia. We use Freebase as the knowledge base. ... All datasets are available at: http://bit.ly/SynSetMine-dataset.
Dataset Splits Yes We tune hyper-parameters in all (semi-)supervised algorithms using 5-fold cross validation on the training set.
Hardware Specification Yes We implement our model based on the PyTorch library, same as the L2C baseline. We train two neural models (SynSetMine and L2C) on one Quadro P4000 GPU and run all the other methods on CPU.
Software Dependencies No The paper mentions implementing the model based on the 'PyTorch library' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup Yes For SynSetMine, we use a neural network with two hidden layers (of sizes 50, 250) as embedding transformer, and another neural network with three hidden layers (of sizes 250, 500, 250) as post transformer (c.f. Figure 3). We optimize our model using Adam with initial learning rate 0.001 and apply the dropout technique with dropout rate 0.5. For the set generation algorithm, we set the probability threshold θ to be 0.5.
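The set generation procedure cited in the Pseudocode row (Algorithm 1) can be sketched as a greedy loop: each vocabulary term is scored against every existing synonym set, joins the best-matching set if the set-instance probability clears the threshold θ = 0.5, and otherwise starts a new singleton set. A minimal sketch follows; note that the paper uses a learned neural set-instance classifier, whereas `set_instance_score` here is a toy character-trigram heuristic used purely as a stand-in.

```python
# Hedged sketch of a greedy set-generation loop in the spirit of the
# paper's Algorithm 1. The real set-instance scorer is a trained neural
# classifier; the trigram-overlap scorer below is an illustrative
# placeholder, not the paper's method.

def trigrams(term):
    """Character trigrams of a padded term (helper for the toy scorer)."""
    padded = f"##{term}##"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def set_instance_score(syn_set, term):
    """Toy 'probability' that term belongs to syn_set (assumption:
    best Jaccard overlap of trigrams with any current member)."""
    cand = trigrams(term)
    best = 0.0
    for member in syn_set:
        mem = trigrams(member)
        best = max(best, len(cand & mem) / len(cand | mem))
    return best

def generate_sets(vocab, theta=0.5):
    """Greedy set generation with probability threshold theta (0.5 in the paper)."""
    sets = []
    for term in vocab:
        scored = [(set_instance_score(s, term), s) for s in sets]
        best_score, best_set = max(scored, default=(0.0, None),
                                   key=lambda pair: pair[0])
        if best_set is not None and best_score >= theta:
            best_set.append(term)   # join the best-matching existing set
        else:
            sets.append([term])     # start a new singleton set
    return sets
```

Because each term is placed exactly once, a single pass over the vocabulary yields a disjoint partition into synonym sets, which is what makes the procedure efficient relative to exhaustive pairwise clustering.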
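The layer sizes quoted in the Experiment Setup row imply a permutation-invariant (DeepSets-style) scorer: an embedding transformer (hidden sizes 50, 250) maps each element embedding, the outputs are summed into a set representation, and a post transformer (hidden sizes 250, 500, 250) processes that sum. The sketch below wires up those shapes with untrained random weights; the input embedding dimension (50), the ReLU activation, and the final scalar scoring layer are assumptions for illustration only.

```python
# Hedged sketch of the set-scorer shape described in the paper's setup:
# embedding transformer (50, 250) -> sum pooling -> post transformer
# (250, 500, 250). Weights are random and untrained; this only
# demonstrates the architecture's permutation invariance, not the model.
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight MLP as a list of (W, b) layers (sketch only)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    for W, b in layers:
        x = np.maximum(x @ W + b, 0.0)  # ReLU activation (assumption)
    return x

EMBED_DIM = 50                                     # assumed input embedding size
embedding_transformer = mlp([EMBED_DIM, 50, 250])  # hidden sizes from the paper
post_transformer = mlp([250, 250, 500, 250])       # hidden sizes from the paper
score_head = mlp([250, 1])                         # assumed final scoring layer

def set_score(element_embeddings):
    """Score a set of element embeddings; sum pooling makes the result
    independent of element order."""
    transformed = forward(embedding_transformer, element_embeddings)  # (n, 250)
    set_repr = transformed.sum(axis=0)                                # (250,)
    return float(forward(score_head, forward(post_transformer, set_repr))[0])
```

Because the per-element outputs are summed before the post transformer, reordering the set's elements cannot change the score, which is the key structural property such a set-input model needs.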