Mining Entity Synonyms with Efficient Neural Set Generation

Authors: Jiaming Shen, Ruiliang Lyu, Xiang Ren, Michelle Vanni, Brian Sadler, Jiawei Han

AAAI 2019

Reproducibility Variable Result LLM Response
Research Type Experimental Experiments on three real datasets from different domains demonstrate both effectiveness and efficiency of SynSetMine for mining entity synonym sets.
Researcher Affiliation Collaboration Department of Computer Science, University of Illinois Urbana-Champaign, IL, USA Department of Electronic Engineering, Shanghai Jiao Tong University, China U.S. Army Research Laboratory, MD, USA Department of Computer Science, University of Southern California, CA, USA
Pseudocode Yes Algorithm 1: Set Generation Algorithm
Open Source Code Yes Our model implementation is available at: https://github.com/mickeystroller/SynSetMine-pytorch.
Open Datasets Yes We evaluate SynSetMine on three public benchmark datasets used in (Qu, Ren, and Han 2017): 1. Wiki contains 100K articles in the Wikipedia. We use Freebase as the knowledge base. ... All datasets are available at: http://bit.ly/SynSetMine-dataset.
Dataset Splits Yes We tune hyper-parameters in all (semi-)supervised algorithms using 5-fold cross validation on the training set.
Hardware Specification Yes We implement our model based on the PyTorch library, same as the L2C baseline. We train two neural models (SynSetMine and L2C) on one Quadro P4000 GPU and run all the other methods on CPU.
Software Dependencies No The paper mentions implementing the model based on the 'PyTorch library' but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup Yes For SynSetMine, we use a neural network with two hidden layers (of sizes 50, 250) as embedding transformer, and another neural network with three hidden layers (of sizes 250, 500, 250) as post transformer (c.f. Figure 3). We optimize our model using Adam with initial learning rate 0.001 and apply the dropout technique with dropout rate 0.5. For the set generation algorithm, we set the probability threshold θ to be 0.5.
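The set generation procedure cited in the Pseudocode row (Algorithm 1) can be sketched as a greedy loop: each vocabulary term is scored against every existing synonym set, joins the best-matching set if the set-instance probability clears the threshold θ = 0.5, and otherwise starts a new singleton set. A minimal sketch follows; note that the paper uses a learned neural set-instance classifier, whereas `set_instance_score` here is a toy character-trigram heuristic used purely as a stand-in.

```python
# Hedged sketch of a greedy set-generation loop in the spirit of the
# paper's Algorithm 1. The real set-instance scorer is a trained neural
# classifier; the trigram-overlap scorer below is an illustrative
# placeholder, not the paper's method.

def trigrams(term):
    """Character trigrams of a padded term (helper for the toy scorer)."""
    padded = f"##{term}##"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def set_instance_score(syn_set, term):
    """Toy 'probability' that term belongs to syn_set (assumption:
    best Jaccard overlap of trigrams with any current member)."""
    cand = trigrams(term)
    best = 0.0
    for member in syn_set:
        mem = trigrams(member)
        best = max(best, len(cand & mem) / len(cand | mem))
    return best

def generate_sets(vocab, theta=0.5):
    """Greedy set generation with probability threshold theta (0.5 in the paper)."""
    sets = []
    for term in vocab:
        scored = [(set_instance_score(s, term), s) for s in sets]
        best_score, best_set = max(scored, default=(0.0, None),
                                   key=lambda pair: pair[0])
        if best_set is not None and best_score >= theta:
            best_set.append(term)   # join the best-matching existing set
        else:
            sets.append([term])     # start a new singleton set
    return sets
```

Because each term is placed exactly once, a single pass over the vocabulary yields a disjoint partition into synonym sets, which is what makes the procedure efficient relative to exhaustive pairwise clustering.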
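The layer sizes quoted in the Experiment Setup row imply a permutation-invariant (DeepSets-style) scorer: an embedding transformer (hidden sizes 50, 250) maps each element embedding, the outputs are summed into a set representation, and a post transformer (hidden sizes 250, 500, 250) processes that sum. The sketch below wires up those shapes with untrained random weights; the input embedding dimension (50), the ReLU activation, and the final scalar scoring layer are assumptions for illustration only.

```python
# Hedged sketch of the set-scorer shape described in the paper's setup:
# embedding transformer (50, 250) -> sum pooling -> post transformer
# (250, 500, 250). Weights are random and untrained; this only
# demonstrates the architecture's permutation invariance, not the model.
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    """Random-weight MLP as a list of (W, b) layers (sketch only)."""
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(layers, x):
    for W, b in layers:
        x = np.maximum(x @ W + b, 0.0)  # ReLU activation (assumption)
    return x

EMBED_DIM = 50                                     # assumed input embedding size
embedding_transformer = mlp([EMBED_DIM, 50, 250])  # hidden sizes from the paper
post_transformer = mlp([250, 250, 500, 250])       # hidden sizes from the paper
score_head = mlp([250, 1])                         # assumed final scoring layer

def set_score(element_embeddings):
    """Score a set of element embeddings; sum pooling makes the result
    independent of element order."""
    transformed = forward(embedding_transformer, element_embeddings)  # (n, 250)
    set_repr = transformed.sum(axis=0)                                # (250,)
    return float(forward(score_head, forward(post_transformer, set_repr))[0])
```

Because the per-element outputs are summed before the post transformer, reordering the set's elements cannot change the score, which is the key structural property such a set-input model needs.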