CNN-Based Chinese NER with Lexicon Rethinking

Authors: Tao Gui, Ruotian Ma, Qi Zhang, Lujun Zhao, Yu-Gang Jiang, Xuanjing Huang

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on four datasets show that the proposed method can achieve better performance than both word-level and character-level baseline methods.
Researcher Affiliation | Collaboration | Tao Gui (1), Ruotian Ma (1), Qi Zhang (1), Lujun Zhao (1), Yu-Gang Jiang (1,2), and Xuanjing Huang (1); (1) School of Computer Science, Fudan University, Shanghai, China; (2) Jilian Technology Group (Video++), Shanghai, China
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | Yes | Our code are released at https://github.com/guitaowufeng/LR-CNN.
Open Datasets | Yes | We evaluate the proposed method on four datasets, including OntoNotes [Weischedel et al., 2011], MSRA [Levow, 2006], Weibo NER [Peng and Dredze, 2015; He and Sun, 2016], and Resume NER [Zhang and Yang, 2018].
Dataset Splits | Yes | Table 1: Statistics of datasets. OntoNotes: Train 15.7k / Dev 4.3k / Test 4.3k sentences (491.9k / 200.5k / 208.1k chars); MSRA: Train 46.4k / Test 4.4k sentences (2169.9k / 172.6k chars; no development split is reported); Weibo: Train 1.4k / Dev 0.27k / Test 0.27k sentences (73.8k / 14.5k / 14.8k chars); Resume: Train 3.8k / Dev 0.46k / Test 0.48k sentences (124.1k / 13.9k / 15.1k chars). These counts are also collected into a small data structure in the sketch after the table.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU/CPU models, memory) used for the experiments.
Software Dependencies | No | The paper mentions Adamax optimization and word2vec, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | For all four of the datasets, we used the Adamax [Kingma and Ba, 2014] optimization to train our networks. The initial learning rate was set at 0.0015, with a decay rate of 0.05. To avoid overfitting, we employed the dropout technique (50% dropout rate) on the character embeddings, lexicon embeddings and each layer of the CNNs. The character embeddings and lexicon embeddings were initialized by a pretrained embedding and then fine-tuned during the training. The character embedding size and lexicon embedding size were set to 50. For the biggest dataset, MSRA, we used five layers of CNNs with an output channel size of 300. For the other datasets, we used four layers of CNNs with an output channel size of 128. We used early stopping, based on the performance on the development set.
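
As a concrete reading of the Experiment Setup row above, the sketch below collects the reported hyperparameters into a PyTorch-style configuration. Only the Adamax optimizer, the initial learning rate of 0.0015, the 0.05 decay rate, the 50% dropout, the embedding size of 50, and the per-dataset CNN depth/width come from the paper; the exact learning-rate decay formula and its per-epoch application are assumptions, and should be checked against the released code.

```python
# Minimal sketch of the reported training setup (PyTorch). The decay schedule
# lr_t = lr_0 / (1 + decay * epoch) is an assumption; the paper only states an
# initial learning rate of 0.0015 and a decay rate of 0.05.
import torch
import torch.nn as nn

EMB_DIM = 50           # character and lexicon embedding size (from the paper)
DROPOUT = 0.5          # applied to embeddings and every CNN layer (from the paper)
INIT_LR, DECAY = 0.0015, 0.05

# Per-dataset CNN depth and output channel size (from the paper).
CNN_CONFIG = {
    "msra":      {"layers": 5, "channels": 300},
    "ontonotes": {"layers": 4, "channels": 128},
    "weibo":     {"layers": 4, "channels": 128},
    "resume":    {"layers": 4, "channels": 128},
}

def build_optimizer(model: nn.Module) -> torch.optim.Optimizer:
    """Adamax with the paper's initial learning rate."""
    return torch.optim.Adamax(model.parameters(), lr=INIT_LR)

def decayed_lr(epoch: int) -> float:
    """Assumed decay schedule; not specified in the quoted text."""
    return INIT_LR / (1.0 + DECAY * epoch)

def set_lr(optimizer: torch.optim.Optimizer, epoch: int) -> None:
    """Apply the (assumed) per-epoch learning-rate decay."""
    for group in optimizer.param_groups:
        group["lr"] = decayed_lr(epoch)
```

The early stopping mentioned in the setup would wrap the usual epoch loop around these helpers, monitoring development-set performance.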
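
For quick reference, the split sizes from the Dataset Splits row above can also be kept as a small mapping. The structure below is a convenience summary, not part of the paper's release; the sentence counts are the approximate values from Table 1, and MSRA is recorded without a development split.

```python
# Approximate sentence counts per split, transcribed from Table 1 of the paper.
# MSRA reports no development split, hence dev=None.
DATASET_SPLITS = {
    "ontonotes": {"train": 15_700, "dev": 4_300, "test": 4_300},
    "msra":      {"train": 46_400, "dev": None,  "test": 4_400},
    "weibo":     {"train": 1_400,  "dev": 270,   "test": 270},
    "resume":    {"train": 3_800,  "dev": 460,   "test": 480},
}
```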