Entity Synonym Discovery via Multipiece Bilateral Context Matching

Authors: Chenwei Zhang, Yaliang Li, Nan Du, Wei Fan, Philip S. Yu

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results demonstrate that the proposed model is able to detect synonym sets that are not observed during training on both generic and domain-specific datasets (Wiki+Freebase, PubMed+UMLS, and MedBook+MKG), with up to 4.16% improvement in Area Under the Curve and 3.19% in Mean Average Precision over the best baseline method.
Researcher Affiliation | Collaboration | Amazon, Seattle, WA 98109 USA; Alibaba Group, Bellevue, WA 98004 USA; Tencent Medical AI Lab, Palo Alto, CA 94306 USA; University of Illinois at Chicago, Chicago, IL 60607 USA
Pseudocode | No | The paper describes the model architecture and training objectives using equations and diagrams, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data are available at https://github.com/czhang99/SynonymNet.
Open Datasets | Yes | Wiki+Freebase and PubMed+UMLS are publicly available datasets used in previous synonym discovery tasks [Qu et al., 2017]. MedBook is a Chinese dataset collected by the authors, comprising 0.51M pieces of context from Chinese medical textbooks as well as online medical question answering forums.
Dataset Splits | Yes | Table 1 (Dataset Statistics) reports validation split sizes for the three datasets: #VALID = 394, 386, and 661.
Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or memory.
Software Dependencies | No | The paper mentions using the Stanford CoreNLP package and Jieba for preprocessing, and skip-gram for word vectors, but it does not provide version numbers for these software components.
Experiment Setup | Yes | The model is trained with a wide range of hyperparameter configurations (Table 5): the number of randomly sampled contexts P = Q per entity is varied from 1 to 20; each piece of context is chunked to a maximum length T; the context-encoder hidden dimension d_CE is varied from 8 to 1024; the margin m in the triplet loss is varied from 0.1 to 1.75; and different optimizers, batch sizes, and learning rates are tried. Random search is applied to obtain the best-performing hyperparameter setting on the validation dataset (Table 6).
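The random search described in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the released SynonymNet code: the discrete grids, the optimizer/batch-size/learning-rate choices, and the `evaluate` callable (returning a validation metric such as AUC) are assumptions; only the value ranges (P = Q in 1..20, d_CE in 8..1024, margin in 0.1..1.75) come from the report above.

```python
import random

# Hypothetical search space mirroring the ranges reported in the paper's
# Table 5; the concrete grid points and option lists are illustrative.
SEARCH_SPACE = {
    "num_contexts": list(range(1, 21)),                     # P = Q, 1..20
    "hidden_dim": [8, 16, 32, 64, 128, 256, 512, 1024],     # d_CE, 8..1024
    "margin": [round(0.1 + 0.15 * i, 2) for i in range(12)],  # 0.1..1.75
    "optimizer": ["sgd", "adam", "rmsprop"],
    "batch_size": [16, 32, 64, 128],
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],
}

def sample_config(rng):
    """Draw one configuration uniformly at random from the search space."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(evaluate, num_trials=50, seed=0):
    """Return (best_config, best_score) over `num_trials` random draws.

    `evaluate` is a placeholder callable that trains the model under a
    configuration and returns its validation score (e.g. AUC).
    """
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(num_trials):
        config = sample_config(rng)
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
```

In practice `evaluate` would wrap a full training run and report the validation metric; fixing the seed makes the search reproducible.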