Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance

Authors: Dong Zhang, Suzhong Wei, Shoushan Li, Hanqian Wu, Qiaoming Zhu, Guodong Zhou (pp. 14347-14355)

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentation on the two benchmark datasets demonstrates the superiority of our MNER model.
Researcher Affiliation | Academia | 1 School of Computer Science and Technology, Soochow University, China; 2 School of Computer Science and Engineering, Southeast University, China
Pseudocode | No | The paper describes the model architecture and processes verbally and mathematically, but does not provide a clearly labeled pseudocode block or algorithm.
Open Source Code | Yes | To motivate future research, the code will be released in our homepage. (Footnote 3: https://github.com/MANLP-suda/UMGF)
Open Datasets | Yes | Following (Yu et al. 2020), we first use two public Twitter datasets (i.e., Twitter-2015 and Twitter-2017) for MNER, which are provided by (Zhang et al. 2018) and (Lu et al. 2018), respectively.
Dataset Splits | Yes | Table 1 shows the number of entities for each type and the size of data split.
Hardware Specification | Yes | For all neural models, we conduct all the experiments on NVIDIA GTX 1080 Ti GPUs with pytorch 1.7.
Software Dependencies | Yes | For all neural models, we conduct all the experiments on NVIDIA GTX 1080 Ti GPUs with pytorch 1.7.
Experiment Setup | Yes | The maximum length of the sentence input and the batch size are respectively set to 128 and 16. For our approach, the word embeddings X are initialized with the cased BERT-base model pre-trained by Devlin et al. (2019) with dimension of 768, and fine-tuned during training. The visual embeddings are initialized by ResNet152 with dimension of 2048 and fine-tuned during training. After MLPs, the dimension d of each node is transformed into 512. The head size in multi-head attention is set as 8. The learning rate, the dropout rate, and the tradeoff parameter are respectively set to 1e-4, 0.5, and 0.5, which achieve the best performance on the development set of both datasets via a small grid search over the combinations of [1e-5, 1e-4], [0.1, 0.5], and [0.1, 0.9]. Based on the best-performing development results, the layer number of multi-modal graph fusion is 2.
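
To make the stack quoted in the Hardware Specification and Software Dependencies rows (PyTorch 1.7 on NVIDIA GTX 1080 Ti GPUs) easy to verify locally, a minimal environment check is sketched below. It uses only standard PyTorch calls and is not part of the authors' released code.

```python
# Minimal sketch: check the local install against what the paper reports
# (PyTorch 1.7, NVIDIA GTX 1080 Ti). Not taken from the authors' repository.
import torch

print("PyTorch version:", torch.__version__)        # paper reports 1.7
print("CUDA available: ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))    # paper reports GTX 1080 Ti
```

The hyperparameters quoted in the Experiment Setup row can also be collected into one illustrative configuration, with a small projection-and-attention skeleton around it. This is a hedged sketch, not the released UMGF implementation: the class and field names below are hypothetical, the number of visual regions is an assumption, and only the numeric values come from the quoted setup.

```python
# Hedged sketch of the quoted experiment setup. Names are illustrative; values follow
# the quoted hyperparameters (dims 768/2048 -> 512, 8 heads, 2 fusion layers,
# lr 1e-4, dropout 0.5, tradeoff 0.5, max length 128, batch size 16).
from dataclasses import dataclass

import torch
from torch import nn


@dataclass
class UMGFConfig:
    max_seq_len: int = 128          # maximum sentence length
    batch_size: int = 16
    text_dim: int = 768             # cased BERT-base token embeddings (fine-tuned)
    visual_dim: int = 2048          # ResNet152 visual features (fine-tuned)
    node_dim: int = 512             # dimension d of each graph node after the MLPs
    num_heads: int = 8              # multi-head attention heads
    num_fusion_layers: int = 2      # multi-modal graph fusion layers (best on dev)
    learning_rate: float = 1e-4     # chosen from [1e-5, 1e-4]
    dropout: float = 0.5            # chosen from [0.1, 0.5]
    tradeoff: float = 0.5           # chosen from [0.1, 0.9]


class NodeFusionSketch(nn.Module):
    """Illustrative only: project text (768-d) and visual (2048-d) features into a
    shared 512-d node space and apply 8-head self-attention over all nodes."""

    def __init__(self, cfg: UMGFConfig):
        super().__init__()
        self.text_mlp = nn.Sequential(nn.Linear(cfg.text_dim, cfg.node_dim),
                                      nn.ReLU(), nn.Dropout(cfg.dropout))
        self.visual_mlp = nn.Sequential(nn.Linear(cfg.visual_dim, cfg.node_dim),
                                        nn.ReLU(), nn.Dropout(cfg.dropout))
        # (seq_len, batch, dim) layout keeps this compatible with PyTorch 1.7.
        self.attn = nn.MultiheadAttention(cfg.node_dim, cfg.num_heads,
                                          dropout=cfg.dropout)

    def forward(self, text_feats, visual_feats):
        # text_feats: (seq_len, batch, 768); visual_feats: (num_regions, batch, 2048)
        nodes = torch.cat([self.text_mlp(text_feats),
                           self.visual_mlp(visual_feats)], dim=0)
        fused, _ = self.attn(nodes, nodes, nodes)    # self-attention over all nodes
        return fused


if __name__ == "__main__":
    cfg = UMGFConfig()
    model = NodeFusionSketch(cfg)
    text = torch.randn(cfg.max_seq_len, cfg.batch_size, cfg.text_dim)
    visual = torch.randn(49, cfg.batch_size, cfg.visual_dim)  # 49 regions is an assumption
    print(model(text, visual).shape)  # (max_seq_len + 49, batch_size, 512)
```

The paper's actual graph construction, the use of the tradeoff parameter, and the sequence-labeling decoder are intentionally omitted here; the sketch only shows how the quoted dimensions and head count fit together.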