Mining Entity Synonyms with Efficient Neural Set Generation
Authors: Jiaming Shen, Ruiliang Lyu, Xiang Ren, Michelle Vanni, Brian Sadler, Jiawei Han
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three real datasets from different domains demonstrate both effectiveness and efficiency of SynSetMine for mining entity synonym sets. |
| Researcher Affiliation | Collaboration | Department of Computer Science, University of Illinois Urbana-Champaign, IL, USA; Department of Electronic Engineering, Shanghai Jiao Tong University, China; U.S. Army Research Laboratory, MD, USA; Department of Computer Science, University of Southern California, CA, USA |
| Pseudocode | Yes | Algorithm 1: Set Generation Algorithm (a hedged sketch of this greedy procedure appears after the table) |
| Open Source Code | Yes | Our model implementation is available at: https://github.com/mickeystroller/SynSetMine-pytorch. |
| Open Datasets | Yes | We evaluate SynSetMine on three public benchmark datasets used in (Qu, Ren, and Han 2017): 1. Wiki contains 100K articles in Wikipedia. We use Freebase as the knowledge base. ... All datasets are available at: http://bit.ly/SynSetMine-dataset. |
| Dataset Splits | Yes | We tune hyper-parameters in all (semi-)supervised algorithms using 5-fold cross validation on the training set. |
| Hardware Specification | Yes | We implement our model based on the PyTorch library, same as the L2C baseline. We train two neural models (SynSetMine and L2C) on one Quadro P4000 GPU and run all the other methods on CPU. |
| Software Dependencies | No | The paper mentions implementing the model based on the 'PyTorch library' but does not specify a version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For SynSetMine, we use a neural network with two hidden layers (of sizes 50, 250) as the embedding transformer, and another neural network with three hidden layers (of sizes 250, 500, 250) as the post transformer (c.f. Figure 3). We optimize our model using Adam with initial learning rate 0.001 and apply the dropout technique with dropout rate 0.5. For the set generation algorithm, we set the probability threshold θ to 0.5. (See the model sketch after the table.) |
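
The "Pseudocode" row above points at the paper's greedy set generation procedure: vocabulary terms are processed one at a time, each term is scored against every existing synonym set by the learned set-instance classifier, and the term either joins the best-scoring set (when the probability clears the threshold θ) or starts a new singleton set. Below is a minimal Python sketch of that loop, assuming a trained classifier is available; `set_instance_prob` is a hypothetical stand-in, not an identifier from the released code.

```python
def generate_synonym_sets(vocab, set_instance_prob, theta=0.5):
    """Greedy set generation, sketched after Algorithm 1 in the paper.

    vocab             -- iterable of terms to cluster
    set_instance_prob -- hypothetical trained classifier:
                         (synonym_set, term) -> probability the term belongs
    theta             -- probability threshold (the paper uses 0.5)
    """
    clusters = []
    for term in vocab:
        if not clusters:
            clusters.append({term})  # first term starts the first set
            continue
        # Score the term against every existing synonym set.
        scores = [set_instance_prob(c, term) for c in clusters]
        best = max(range(len(clusters)), key=lambda i: scores[i])
        if scores[best] > theta:
            clusters[best].add(term)   # join the best-matching set
        else:
            clusters.append({term})    # start a new singleton set
    return clusters
```

Note that each term is compared against whole sets rather than against every other term, so the number of classifier calls grows with the number of clusters instead of quadratically in the vocabulary size, which is where the efficiency over pairwise synonym scoring comes from.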
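
The "Experiment Setup" row fixes enough hyper-parameters to sketch the set scorer: an embedding transformer with hidden layers of sizes 50 and 250 applied to each term embedding, pooling over the set, and a post transformer with hidden layers of sizes 250, 500, 250, trained with Adam (lr 0.001) and dropout 0.5. The PyTorch sketch below follows those sizes; the 50-dimensional input embeddings, the sum-pooling step, and the scalar output head are assumptions filled in around the quoted passage, not restated by it.

```python
import torch
import torch.nn as nn

EMB_DIM = 50  # assumption: the quoted passage does not restate the embedding size

def mlp(sizes, dropout=0.5):
    """Stack of Linear + ReLU + Dropout layers with the given sizes."""
    layers = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(d_in, d_out), nn.ReLU(), nn.Dropout(dropout)]
    return nn.Sequential(*layers)

class SetScorer(nn.Module):
    """Permutation-invariant set scorer with the quoted layer sizes:
    embedding transformer (50, 250), post transformer (250, 500, 250)."""
    def __init__(self):
        super().__init__()
        self.embed_transformer = mlp([EMB_DIM, 50, 250])
        self.post_transformer = mlp([250, 250, 500, 250])
        self.out = nn.Linear(250, 1)  # scalar set-quality score (assumption)

    def forward(self, term_embs):  # term_embs: (set_size, EMB_DIM)
        pooled = self.embed_transformer(term_embs).sum(dim=0)  # sum-pool over the set
        return self.out(self.post_transformer(pooled))

model = SetScorer()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # quoted: Adam, lr 0.001
score = model(torch.randn(3, EMB_DIM))  # score a toy 3-term set
```

One plausible way to turn this scorer into the set-instance probability used in the generation sketch above is σ(s(S ∪ {t}) − s(S)), a sigmoid over the score gain from adding the term, though the exact formulation should be checked against the paper and the released code.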