A Bidirectional Multi-paragraph Reading Model for Zero-shot Entity Linking

Authors: Hongyin Tang, Xingwu Sun, Beihong Jin, Fuzheng Zhang (pp. 13889-13897)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that our bidirectional model can capture long-range context dependencies and outperform the baseline model by 3-4% in terms of accuracy.
Researcher Affiliation | Collaboration | Hongyin Tang (1,2,*), Xingwu Sun (3), Beihong Jin (1,2), Fuzheng Zhang (3); (1) State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences; (2) University of Chinese Academy of Sciences, Beijing, China; (3) Meituan-Dianping Group, China
Pseudocode | No | The paper describes the model architecture and mathematical formulations (Equations 1-19) but does not provide structured pseudocode or an algorithm block.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the methodology is openly available.
Open Datasets | Yes | We conduct our experiments on the dataset which is proposed in (Logeswaran et al. 2019) and built using documents on Wikia.
Dataset Splits | Yes | The training set has 49,275 labeled mentions while the validation and test sets both have 10,000 mentions.
Hardware Specification | Yes | The CPU computations were run on an Intel Xeon Processor 5118 CPU. The GPU computations were run on a single Nvidia Tesla V100 GPU.
Software Dependencies | No | The paper mentions using a BERT model and the Adam optimizer but does not provide specific version numbers for software dependencies (e.g., Python, PyTorch/TensorFlow, specific BERT version).
Experiment Setup | Yes | We use the Adam optimizer (Kingma and Ba 2015) with a learning rate of 2e-5 and warmup over the first 10% of the total 10,000 steps. The batch size is 16. During the fine-tuning stage, the paragraph lengths m, n and the numbers of paragraphs lp, lq significantly influence accuracy and inference time. We experiment with several settings of these parameters. Since the BERT-based baseline uses m = n = 128, for fairness we set m = n = 128, lp = lq = 2 when comparing with the existing BERT-based baseline.
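
The fine-tuning setup quoted in the last row (Adam, learning rate 2e-5, warmup over the first 10% of 10,000 steps, batch size 16, m = n = 128, lp = lq = 2) can be expressed as a configuration sketch. This is a minimal illustration, assuming PyTorch and the Hugging Face transformers library and a generic "bert-base-uncased" checkpoint (the paper only says BERT); the bidirectional multi-paragraph reader itself is not implemented here.

```python
# Hedged sketch of the reported fine-tuning configuration; the checkpoint name
# and the linear warmup/decay schedule are assumptions, not stated in the paper.
import torch
from transformers import BertModel, get_linear_schedule_with_warmup

TOTAL_STEPS = 10_000
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # warmup over the first 10% of steps
BATCH_SIZE = 16
MENTION_PARAGRAPH_LEN = 128             # m
ENTITY_PARAGRAPH_LEN = 128              # n
NUM_MENTION_PARAGRAPHS = 2              # lp
NUM_ENTITY_PARAGRAPHS = 2               # lq

model = BertModel.from_pretrained("bert-base-uncased")  # stand-in encoder
optimizer = torch.optim.Adam(model.parameters(), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=WARMUP_STEPS,
    num_training_steps=TOTAL_STEPS,
)

# Training-loop skeleton: the forward/backward pass over the lp x lq paragraph
# pairs is omitted; the scheduler is stepped once per optimizer update.
for step in range(TOTAL_STEPS):
    optimizer.zero_grad()
    # loss = model(...); loss.backward()
    optimizer.step()
    scheduler.step()
```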
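
Since no library versions are reported, a reproduction would need to record its own environment. The snippet below is a hypothetical helper, assuming PyTorch and transformers as the underlying libraries (the paper does not name a framework); it only logs the versions a reproducer should pin.

```python
# Hypothetical environment-logging helper; package choices are assumptions.
import sys

import torch
import transformers


def log_environment() -> dict:
    """Collect the version information a reproduction would need to pin."""
    env = {
        "python": sys.version.split()[0],
        "torch": torch.__version__,
        "transformers": transformers.__version__,
        "cuda_available": torch.cuda.is_available(),
    }
    for key, value in env.items():
        print(f"{key}: {value}")
    return env


if __name__ == "__main__":
    log_environment()
```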