Fast and Accurate Neural CRF Constituency Parsing

Authors: Yu Zhang, Houquan Zhou, Zhenghua Li

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on PTB, CTB5.1, and CTB7 show that our two-stage CRF parser achieves new state-of-the-art performance on both settings of w/o and w/ BERT, and can parse over 1,000 sentences per second. |
| Researcher Affiliation | Academia | Yu Zhang, Houquan Zhou, Zhenghua Li; Institute of Artificial Intelligence, School of Computer Science and Technology, Soochow University, Suzhou, China; yzhang.cs@outlook.com, hqzhou@stu.suda.edu.cn, zhli13@suda.edu.cn |
| Pseudocode | Yes | Algorithm 1 Batchified Inside Algorithm. (A hedged sketch of such a batchified inside computation is given below the table.) |
| Open Source Code | Yes | We release our code at https://github.com/yzhangcs/crfpar. |
| Open Datasets | Yes | We conduct experiments on three English and Chinese datasets. The first two datasets, i.e., PTB and CTB5.1, are widely used in the community. We follow the conventional train/dev/test data split. Table 1 shows the data statistics, including the number of sentences and constituent labels. |
| Dataset Splits | Yes | We follow the conventional train/dev/test data split. Considering that both CTB5.1 dev/test only have about 350 sentences, we also use the larger CTB7 for more robust investigations, following the data split suggested in the official manual. Table 1 shows the data statistics. |
| Hardware Specification | Yes | Our models are both run on a machine with Intel Xeon E5-2650 v4 CPU and NVIDIA GeForce GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper mentions tools such as NLTK and EVALB, but does not give version numbers for any software dependency, e.g., the deep learning framework (PyTorch or TensorFlow) or other libraries. |
| Experiment Setup | Yes | The dimensions of char embedding, word embedding, and Char LSTM outputs are 50, 100, 100, respectively. All dropout ratios are 0.33. The mini-batch size is 5,000 words. The training process continues at most 1,000 epochs and is stopped if the peak performance on dev data does not increase in 100 consecutive epochs. (These values are collected in the hypothetical configuration sketch below the table.) |
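The Pseudocode row refers to the paper's "Algorithm 1 Batchified Inside Algorithm". The following is a minimal PyTorch sketch of a batchified inside computation for span-based CRF constituency parsing, not the authors' released code: the function name `batched_inside`, the `scores`/`lens` interface, and the fence-post indexing are assumptions for illustration. It shows the core idea of the batchification, namely that all spans of the same width, across all sentences in the batch, are updated in one vectorized step.

```python
import torch

def batched_inside(scores: torch.Tensor, lens: torch.Tensor) -> torch.Tensor:
    """Compute log-partition values with a batchified inside algorithm.

    scores: [B, N, N] span scores, where scores[b, i, j] scores the span
            between fence posts i and j (N = max sentence length + 1).
    lens:   [B] LongTensor of sentence lengths.
    Returns a [B] tensor of log-partition (inside) scores.
    """
    B, N, _ = scores.shape
    dev = scores.device
    # alpha[b, i, j] holds the inside (log-sum-exp over all trees) score of span (i, j)
    alpha = scores.new_full((B, N, N), float('-inf'))
    # width-1 spans: the inside score is just the span score
    idx = torch.arange(N - 1, device=dev)
    alpha[:, idx, idx + 1] = scores[:, idx, idx + 1]
    # grow spans width by width; every span of the current width, in every
    # sentence of the batch, is updated in one vectorized step
    for w in range(2, N):
        i = torch.arange(N - w, device=dev)                  # left fence posts
        j = i + w                                            # right fence posts
        k = i.unsqueeze(1) + torch.arange(1, w, device=dev)  # split points, [spans, w-1]
        left = alpha[:, i.unsqueeze(1), k]                   # [B, spans, w-1]
        right = alpha[:, k, j.unsqueeze(1)]                  # [B, spans, w-1]
        alpha[:, i, j] = (left + right).logsumexp(-1) + scores[:, i, j]
    # the partition of each sentence is the inside score of its full span (0, len)
    return alpha[torch.arange(B, device=dev), 0, lens]
```

Because the gradient of the log-partition with respect to each span score equals that span's marginal probability, span marginals can be obtained by back-propagating through this routine rather than coding an explicit outside pass.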
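For convenience, the hyperparameters quoted in the Experiment Setup row can be collected in one place. The dictionary below is a hypothetical configuration; the key names are illustrative and do not come from the authors' code.

```python
# Hypothetical configuration mirroring the hyperparameters reported in the paper;
# key names are illustrative, not the authors' actual config schema.
HYPERPARAMS = {
    "char_emb_dim": 50,        # character embedding size
    "word_emb_dim": 100,       # word embedding size
    "char_lstm_dim": 100,      # Char LSTM output size
    "dropout": 0.33,           # all dropout ratios
    "batch_size_words": 5000,  # mini-batch size, counted in words
    "max_epochs": 1000,        # upper bound on training epochs
    "patience": 100,           # stop if dev peak does not improve for 100 consecutive epochs
}
```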