CNN-Based Chinese NER with Lexicon Rethinking
Authors: Tao Gui, Ruotian Ma, Qi Zhang, Lujun Zhao, Yu-Gang Jiang, Xuanjing Huang
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on four datasets show that the proposed method can achieve better performance than both word-level and character-level baseline methods. |
| Researcher Affiliation | Collaboration | Tao Gui¹, Ruotian Ma¹, Qi Zhang¹, Lujun Zhao¹, Yu-Gang Jiang¹,² and Xuanjing Huang¹; ¹School of Computer Science, Fudan University, Shanghai, China; ²Jilian Technology Group (Video++), Shanghai, China |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | Yes | Our code is released at https://github.com/guitaowufeng/LR-CNN. |
| Open Datasets | Yes | We evaluate the proposed method on four datasets, including OntoNotes [Weischedel et al., 2011], MSRA [Levow, 2006], Weibo NER [Peng and Dredze, 2015; He and Sun, 2016], and Resume NER [Zhang and Yang, 2018]. |
| Dataset Splits | Yes | Table 1 (statistics of datasets) reports the splits. OntoNotes: 15.7k/4.3k/4.3k train/dev/test sentences (491.9k/200.5k/208.1k chars); MSRA: 46.4k train / 4.4k test sentences (2169.9k/172.6k chars); Weibo: 1.4k/0.27k/0.27k sentences (73.8k/14.5k/14.8k chars); Resume: 3.8k/0.46k/0.48k sentences (124.1k/13.9k/15.1k chars). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for experiments were provided in the paper. |
| Software Dependencies | No | The paper mentions Adamax optimization and word2vec, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For all four of the datasets, we used the Adamax [Kingma and Ba, 2014] optimization to train our networks. The initial learning rate was set at 0.0015, with a decay rate of 0.05. To avoid overfitting, we employed the dropout technique (50% dropout rate) on the character embeddings, lexicon embeddings and each layer of the CNNs. The character embeddings and lexicon embeddings were initialized by a pretrained embedding and then fine-tuned during the training. The character embedding size and lexicon embedding size were set to 50. For the biggest dataset, MSRA, we used five layers of CNNs with an output channel size of 300. For the other datasets, we used four layers of CNNs with an output channel size of 128. We used early stopping, based on the performance on the development set. |
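
The experiment-setup row above lists concrete hyperparameters (Adamax, initial learning rate 0.0015 with a 0.05 decay rate, 50% dropout, 50-dimensional character and lexicon embeddings, and 4-5 CNN layers with 128 or 300 output channels). The PyTorch sketch below wires those values into a training configuration for reference; the `LRCNNSketch` class, vocabulary and tag-set sizes, and the per-epoch decay formula `lr = lr0 / (1 + 0.05 * epoch)` are illustrative assumptions, not the released LR-CNN model, which additionally implements the lexicon rethinking mechanism.

```python
# Minimal PyTorch sketch of the quoted training configuration.
# Hyperparameters (lr, decay rate, dropout, embedding size, layer counts,
# channel sizes) come from the paper; everything else is an assumption.
import torch
import torch.nn as nn


class LRCNNSketch(nn.Module):
    """Illustrative stand-in for the paper's CNN encoder (not the released LR-CNN)."""

    def __init__(self, char_vocab=4000, lexicon_vocab=50000,
                 emb_dim=50, channels=128, num_layers=4, num_tags=17):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab, emb_dim)    # 50-dim character embeddings
        self.lex_emb = nn.Embedding(lexicon_vocab, emb_dim)  # 50-dim lexicon embeddings (unused in this sketch)
        self.dropout = nn.Dropout(0.5)                       # 50% dropout rate
        convs, in_ch = [], emb_dim
        for _ in range(num_layers):                          # 4 layers / 128 channels (5 / 300 for MSRA)
            convs.append(nn.Conv1d(in_ch, channels, kernel_size=3, padding=1))
            in_ch = channels
        self.convs = nn.ModuleList(convs)
        self.out = nn.Linear(channels, num_tags)

    def forward(self, char_ids):                             # char_ids: (batch, seq_len)
        x = self.dropout(self.char_emb(char_ids)).transpose(1, 2)
        for conv in self.convs:
            x = self.dropout(torch.relu(conv(x)))
        return self.out(x.transpose(1, 2))                   # (batch, seq_len, num_tags)


model = LRCNNSketch()
optimizer = torch.optim.Adamax(model.parameters(), lr=0.0015)

# Assumed decay schedule lr = lr0 / (1 + 0.05 * epoch); the paper only
# states "a decay rate of 0.05" without giving the exact formula.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: 1.0 / (1.0 + 0.05 * epoch))
```

In a full run, one would call `optimizer.step()` during training and `scheduler.step()` once per epoch, stopping early when development-set performance stops improving, mirroring the early-stopping criterion quoted above; the released repository (https://github.com/guitaowufeng/LR-CNN) contains the authors' actual implementation.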