Fast and Accurate Neural CRF Constituency Parsing

Authors: Yu Zhang, Houquan Zhou, Zhenghua Li

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on PTB, CTB5.1, and CTB7 show that our two-stage CRF parser achieves new state-of-the-art performance on both settings of w/o and w/ BERT, and can parse over 1,000 sentences per second. |
| Researcher Affiliation | Academia | Yu Zhang, Houquan Zhou, Zhenghua Li; Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, Suzhou, China; yzhang.cs@outlook.com, hqzhou@stu.suda.edu.cn, zhli13@suda.edu.cn |
| Pseudocode | Yes | Algorithm 1 Batchified Inside Algorithm. (A hedged sketch of such a batchified inside computation is given below the table.) |
| Open Source Code | Yes | We release our code at https://github.com/yzhangcs/crfpar. |
| Open Datasets | Yes | We conduct experiments on three English and Chinese datasets. The first two datasets, i.e., PTB and CTB5.1, are widely used in the community. We follow the conventional train/dev/test data split. Table 1 shows the data statistics, including the number of sentences and constituent labels. |
| Dataset Splits | Yes | We follow the conventional train/dev/test data split. Considering that both CTB5.1 dev/test only have about 350 sentences, we also use the larger CTB7 for more robust investigations, following the data split suggested in the official manual. Table 1 shows the data statistics. |
| Hardware Specification | Yes | Our models are both run on a machine with Intel Xeon E5-2650 v4 CPU and NVIDIA GeForce GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper mentions tools such as NLTK and EVALB, but does not give version numbers for any software dependency, e.g., the deep learning framework (PyTorch or TensorFlow) or other libraries. |
| Experiment Setup | Yes | The dimensions of char embedding, word embedding, and Char LSTM outputs are 50, 100, 100, respectively. All dropout ratios are 0.33. The mini-batch size is 5,000 words. The training process continues at most 1,000 epochs and is stopped if the peak performance on dev data does not increase in 100 consecutive epochs. (These values are collected in the hypothetical configuration sketch below the table.) |
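The Pseudocode row refers to the paper's "Algorithm 1 Batchified Inside Algorithm". The following is a minimal PyTorch sketch of a batchified inside computation for span-based CRF constituency parsing, not the authors' released code: the function name `batched_inside`, the `scores`/`lens` interface, and the fence-post indexing are assumptions for illustration. It shows the core idea of the batchification, namely that all spans of the same width, across all sentences in the batch, are updated in one vectorized step.

```python
import torch

def batched_inside(scores: torch.Tensor, lens: torch.Tensor) -> torch.Tensor:
    """Compute log-partition values with a batchified inside algorithm.

    scores: [B, N, N] span scores, where scores[b, i, j] scores the span
            between fence posts i and j (N = max sentence length + 1).
    lens:   [B] LongTensor of sentence lengths.
    Returns a [B] tensor of log-partition (inside) scores.
    """
    B, N, _ = scores.shape
    dev = scores.device
    # alpha[b, i, j] holds the inside (log-sum-exp over all trees) score of span (i, j)
    alpha = scores.new_full((B, N, N), float('-inf'))
    # width-1 spans: the inside score is just the span score
    idx = torch.arange(N - 1, device=dev)
    alpha[:, idx, idx + 1] = scores[:, idx, idx + 1]
    # grow spans width by width; every span of the current width, in every
    # sentence of the batch, is updated in one vectorized step
    for w in range(2, N):
        i = torch.arange(N - w, device=dev)                  # left fence posts
        j = i + w                                            # right fence posts
        k = i.unsqueeze(1) + torch.arange(1, w, device=dev)  # split points, [spans, w-1]
        left = alpha[:, i.unsqueeze(1), k]                   # [B, spans, w-1]
        right = alpha[:, k, j.unsqueeze(1)]                  # [B, spans, w-1]
        alpha[:, i, j] = (left + right).logsumexp(-1) + scores[:, i, j]
    # the partition of each sentence is the inside score of its full span (0, len)
    return alpha[torch.arange(B, device=dev), 0, lens]
```

Because the gradient of the log-partition with respect to each span score equals that span's marginal probability, span marginals can be obtained by back-propagating through this routine rather than coding an explicit outside pass.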
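For convenience, the hyperparameters quoted in the Experiment Setup row can be collected in one place. The dictionary below is a hypothetical configuration; the key names are illustrative and do not come from the authors' code.

```python
# Hypothetical configuration mirroring the hyperparameters reported in the paper;
# key names are illustrative, not the authors' actual config schema.
HYPERPARAMS = {
    "char_emb_dim": 50,        # character embedding size
    "word_emb_dim": 100,       # word embedding size
    "char_lstm_dim": 100,      # Char LSTM output size
    "dropout": 0.33,           # all dropout ratios
    "batch_size_words": 5000,  # mini-batch size, counted in words
    "max_epochs": 1000,        # upper bound on training epochs
    "patience": 100,           # stop if dev peak does not improve for 100 consecutive epochs
}
```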