Entity Synonym Discovery via Multipiece Bilateral Context Matching
Authors: Chenwei Zhang, Yaliang Li, Nan Du, Wei Fan, Philip S. Yu
IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that the proposed model is able to detect synonym sets that are not observed during training on both generic and domain-specific datasets: Wiki+Freebase, PubMed+UMLS, and MedBook+MKG, with up to 4.16% improvement in terms of Area Under the Curve and 3.19% in terms of Mean Average Precision compared to the best baseline method. |
| Researcher Affiliation | Collaboration | 1Amazon, Seattle, WA 98109 USA 2Alibaba Group, Bellevue, WA 98004 USA 3Tencent Medical AI Lab, Palo Alto, CA 94306 USA 4University of Illinois at Chicago, Chicago, IL 60607 USA |
| Pseudocode | No | The paper describes the model architecture and training objectives using equations and diagrams, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and data are available at https://github.com/czhang99/SynonymNet. |
| Open Datasets | Yes | Wiki+Freebase and PubMed+UMLS are publicly available datasets used in previous synonym discovery tasks [Qu et al., 2017]. MedBook is a Chinese dataset collected by the authors, comprising 0.51M pieces of contexts gathered from Chinese medical textbooks as well as online medical question answering forums. |
| Dataset Splits | Yes | Table 1 (Dataset Statistics) reports validation splits (#VALID) of 394, 386, and 661 across the three datasets. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions using the Stanford CoreNLP package and Jieba for preprocessing, and skip-gram for word vectors, but it does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We train the proposed model with a wide range of hyperparameter configurations, as shown in Table 5. For the model architecture, we vary the number of randomly sampled contexts P = Q for each entity from 1 to 20. Each piece of context is chunked by a maximum length of T. For the context encoder, we vary the hidden dimension d_CE from 8 to 1024. The margin value m in the triplet loss function is varied from 0.1 to 1.75. For the training, we try different optimizers, vary batch sizes and learning rates. We apply random search to obtain the best-performing hyperparameter setting on the validation dataset, listed in Table 6. |
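The software-dependencies row above names Jieba and skip-gram word vectors but no specific toolkit or versions. The following is a minimal sketch of that preprocessing step, assuming gensim's Word2Vec for the skip-gram training; gensim, the toy sentences, and the hyperparameters (vector_size, window, min_count) are illustrative assumptions, not values reported by the paper.

```python
# Minimal sketch (not the authors' released pipeline): segment Chinese contexts
# with Jieba, then train skip-gram word vectors with gensim's Word2Vec (sg=1).
# vector_size/window/min_count are assumed illustrative defaults.
import jieba
from gensim.models import Word2Vec

raw_contexts = [
    "实体同义词发现依赖上下文信息",   # toy sentences standing in for MedBook contexts
    "上下文匹配有助于同义词识别",
]

# jieba.lcut returns a list of segmented words for each context.
tokenized = [jieba.lcut(line) for line in raw_contexts]

# sg=1 selects the skip-gram objective rather than CBOW.
model = Word2Vec(
    sentences=tokenized,
    vector_size=200,   # embedding dimension (assumed, not from the paper)
    window=5,
    min_count=1,
    sg=1,
)

# The resulting vectors would feed the context encoder downstream.
print(len(model.wv.index_to_key), "vocabulary entries")
```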
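The experiment-setup row quotes the hyperparameter ranges that were explored with random search. Below is a minimal sketch of such a search over those ranges; `train_and_validate` is a hypothetical stand-in for training SynonymNet and returning a validation score (e.g. AUC), and the optimizer, batch-size, and learning-rate candidates are assumptions since the paper only says these were varied.

```python
# Minimal random-search sketch over the hyperparameter ranges quoted above.
# train_and_validate() is hypothetical: it trains one model configuration and
# returns its validation metric (e.g. AUC). It is not part of the released code.
import random

SEARCH_SPACE = {
    "num_contexts":  list(range(1, 21)),                        # P = Q, contexts per entity
    "hidden_dim":    [8, 16, 32, 64, 128, 256, 512, 1024],      # context-encoder size d_CE
    "margin":        [round(0.1 + 0.05 * i, 2) for i in range(34)],  # triplet margin m, 0.1..1.75
    "optimizer":     ["adam", "adagrad", "sgd"],                 # illustrative choices
    "batch_size":    [16, 32, 64, 128],                          # illustrative choices
    "learning_rate": [1e-4, 3e-4, 1e-3, 3e-3],                   # illustrative choices
}

def sample_config(rng: random.Random) -> dict:
    """Draw one configuration uniformly from each hyperparameter range."""
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def random_search(train_and_validate, num_trials: int = 50, seed: int = 0):
    """Return the best-scoring configuration found on the validation set."""
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(num_trials):
        config = sample_config(rng)
        score = train_and_validate(config)  # hypothetical training + validation call
        if score > best_score:
            best_score, best_config = score, config
    return best_config, best_score
```

The chosen configuration would then correspond to what the paper reports in its Table 6.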