Dynamic Modeling Cross- and Self-Lattice Attention Network for Chinese NER
Authors: Shan Zhao, Minghao Hu, Zhiping Cai, Haiwen Chen, Fang Liu (pp. 14515–14523)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on four Chinese NER datasets show that DCSAN obtains state-of-the-art results as well as efficiency compared to several competitive approaches. |
| Researcher Affiliation | Academia | (1) College of Computer, National University of Defense Technology, Changsha, China; (2) Information Research Center of Military Science, PLA Academy of Military Science, Beijing, China; (3) School of Design, Hunan University, Changsha, Hunan |
| Pseudocode | No | The paper describes the proposed model and its components in detail using text and mathematical equations, but it does not include a separate pseudocode block or algorithm figure. |
| Open Source Code | Yes | We will release the source code to facilitate future research in this field. 1https://github.com/zs50910/DCSAN-for-Chinese-NER |
| Open Datasets | Yes | We conduct experiments on four datasets, including Weibo NER (Peng and Dredze 2015), MSRA (Levow 2006), Chinese resume dataset (Zhang and Yang 2018), and E-commerce NER (Ding et al. 2019). |
| Dataset Splits | Yes | Table 1 (statistics of the four Chinese NER datasets, character counts): Weibo — Train 73.8K, Dev 14.5K, Test 14.8K; E-commerce — Train 119.1K, Dev 14.9K, Test 14.7K; Resume — Train 124.1K, Dev 13.9K, Test 15.1K; MSRA — Train 2169.9K, Dev –, Test 172.6K |
| Hardware Specification | No | The paper mentions 'GPU parallelism' and 'GPU' in general terms but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions using BERT embeddings and a word embedding dictionary, but it does not specify version numbers for any software dependencies or libraries (e.g., Python, PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | As for hyper-parameter configurations, the size of character embeddings is 768 and that of word embeddings is 200 by default, and the dimensionality of the hidden size is 768. For attention settings, the head numbers of cross-lattice attention and dynamic self-lattice attention are 8 and 4, respectively, for all datasets. We set the number of self-lattice attention layers l to 2 by default. To avoid overfitting, we regularize our network using dropout with a rate tuned on the development set. To train the model, we use the SGD optimizer with a learning rate of 0.0007 on the Resume, MSRA, and E-commerce datasets and 0.001 on the Weibo dataset. Training runs for 100 epochs until convergence. (See the configuration sketch below.) |
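
For quick reference, the reported setup is collected below as a minimal Python configuration sketch, assuming a PyTorch implementation (as suggested by the linked GitHub repository). The class, field, and function names are illustrative assumptions, not identifiers from the authors' released code, and the dropout rate is left unset because the paper only states that it is tuned on the development set.

```python
# Minimal sketch of the reported hyper-parameter configuration.
# Names (DCSANConfig, build_optimizer, field names) are illustrative assumptions,
# not taken from the authors' released repository.
from dataclasses import dataclass
from typing import Optional

@dataclass
class DCSANConfig:
    char_embed_dim: int = 768        # BERT character embedding size
    word_embed_dim: int = 200        # lexicon word embedding size
    hidden_dim: int = 768            # hidden size
    cross_lattice_heads: int = 8     # cross-lattice attention heads
    self_lattice_heads: int = 4      # dynamic self-lattice attention heads
    self_lattice_layers: int = 2     # number of self-lattice attention layers (l)
    dropout: Optional[float] = None  # paper: rate tuned on the development set
    epochs: int = 100

# Per-dataset learning rates reported for the SGD optimizer.
SGD_LEARNING_RATES = {
    "Resume": 0.0007,
    "MSRA": 0.0007,
    "E-commerce": 0.0007,
    "Weibo": 0.001,
}

def build_optimizer(model_parameters, dataset: str):
    """Return an SGD optimizer configured with the reported learning rate."""
    import torch  # assumes a PyTorch implementation
    return torch.optim.SGD(model_parameters, lr=SGD_LEARNING_RATES[dataset])
```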