MGAD: Learning Descriptional Representation Distilled from Distributional Semantics for Unseen Entities
Authors: Yuanzheng Wang, Xueqi Cheng, Yixing Fan, Xiaofei Zhu, Huasheng Liang, Qiang Yan, Jiafeng Guo
IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on four benchmark datasets show that our approach improves the performance over all baseline methods. |
| Researcher Affiliation | Collaboration | (1) CAS Key Lab of Network Data Science and Technology, ICT, CAS, Beijing, China; (2) University of Chinese Academy of Sciences, Beijing, China; (3) College of Computer Science and Engineering, Chongqing University of Technology; (4) WeChat, Tencent, Guangzhou, China |
| Pseudocode | No | The paper describes the model architecture and loss functions, but it does not provide any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our data, code and models are available at https://github.com/dalek-who/MGAD-entity-linking |
| Open Datasets | Yes | To evaluate the performance of our model, we choose four widely used entity linking datasets: AIDA [Hoffart et al., 2011], ACE [Ratinov et al., 2011], AQUAINT [Milne and Witten, 2008] and MSNBC [Cucerzan, 2007]. |
| Dataset Splits | Yes | for further improving the matching ability, we infer the entity embeddings of E_seen, and finetune the mention encoder of MGAD on AIDA-train with entity embeddings fixed. |
| Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU models, CPU types, or cloud computing instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using RoBERTa and BERT-based models, and the Adam optimizer, but it does not provide specific version numbers for any software libraries, frameworks, or languages (e.g., PyTorch 1.9, Python 3.8). |
| Experiment Setup | Yes | The best weights of the four losses are α1 = 0.338, α2 = 0.002, α3 = 0.33, α4 = 0.33, where the weights sum to 1. α2 is much smaller since the value of Lea is 10² times greater than the others. Temperature τ = 1 in Eqs. (8) and (12), and τ = 2 in Eq. (10). The optimizer is Adam [Kingma and Ba, 2015] with learning rate 2 × 10⁻⁵ and weight decay 0.01, using a linear-warmup learning rate scheduler with the first 10% of steps as warmup. |
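The Dataset Splits row quotes the paper's evaluation protocol: the entity embeddings of E_seen are inferred once and then held fixed while the mention encoder is fine-tuned on AIDA-train. Below is a minimal PyTorch sketch of that freezing pattern; the class and attribute names (`MGADLinker`, `mention_encoder`, `entity_embeddings`) are assumptions for illustration and are not taken from the released code.

```python
import torch
import torch.nn as nn

class MGADLinker(nn.Module):
    """Hypothetical wrapper: a trainable mention encoder scored against frozen entity embeddings."""

    def __init__(self, mention_encoder: nn.Module, entity_embeddings: torch.Tensor):
        super().__init__()
        self.mention_encoder = mention_encoder
        # Inferred embeddings for seen entities, registered as a non-trainable parameter.
        self.entity_embeddings = nn.Parameter(entity_embeddings, requires_grad=False)

    def forward(self, mention_vec: torch.Tensor) -> torch.Tensor:
        # mention_vec: [batch, dim] output of the mention encoder.
        # Dot-product scores against every (frozen) entity embedding: [batch, num_entities].
        return mention_vec @ self.entity_embeddings.T

# Only the mention encoder's parameters are passed to the optimizer, so the
# entity embeddings stay fixed during AIDA-train fine-tuning.
# optimizer = torch.optim.Adam(model.mention_encoder.parameters(), lr=2e-5, weight_decay=0.01)
```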
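The Experiment Setup row reports concrete hyperparameters (four loss weights summing to 1, Adam with lr 2 × 10⁻⁵ and weight decay 0.01, linear warmup over the first 10% of steps). A hedged sketch of that configuration is shown below; the loss names (`l1` … `l4`) and the helper `build_optimizer` are placeholders, not the paper's identifiers, and the scheduler uses the standard `transformers` linear-warmup helper as one plausible implementation.

```python
import torch
from transformers import get_linear_schedule_with_warmup

# Loss weights reported in the paper; alpha[1] is small because the second loss (Lea)
# is roughly 10^2 times larger than the other terms.
ALPHA = (0.338, 0.002, 0.33, 0.33)

def total_loss(l1, l2, l3, l4):
    # Weighted sum of the four training objectives; the weights sum to 1.
    return ALPHA[0] * l1 + ALPHA[1] * l2 + ALPHA[2] * l3 + ALPHA[3] * l4

def build_optimizer(model: torch.nn.Module, num_training_steps: int):
    # Adam with lr 2e-5 and weight decay 0.01, as stated in the paper.
    optimizer = torch.optim.Adam(model.parameters(), lr=2e-5, weight_decay=0.01)
    # Linear warmup over the first 10% of training steps, then linear decay.
    scheduler = get_linear_schedule_with_warmup(
        optimizer,
        num_warmup_steps=int(0.1 * num_training_steps),
        num_training_steps=num_training_steps,
    )
    return optimizer, scheduler
```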