Bidirectional Transition-Based Dependency Parsing

Authors: Yunzhe Yuan, Yong Jiang, Kewei Tu

AAAI 2019, pp. 7434-7441

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show that our methods lead to competitive parsing accuracy and our method based on dynamic oracle consistently achieves the best performance. We evaluate our model on two popular treebanks (PTB and CTB) as well as eight additional treebanks from the Universal Dependency dataset.
Researcher Affiliation | Academia | Yunzhe Yuan, Yong Jiang, Kewei Tu, School of Information Science and Technology, ShanghaiTech University, {yuanyzh,jiangyong,tukw}@shanghaitech.edu.cn
Pseudocode | Yes | The paper includes 'Algorithm 1 Greedy Transition-Based Parsing', 'Algorithm 2 Decoding with Dual Decomposition', and 'Algorithm 3 Decoding Guided by Dynamic Oracle'. (A hedged sketch of greedy transition-based parsing is given after the table.)
Open Source Code | Yes | Our code is based on Kiperwasser and Goldberg (2016) and is available at https://github.com/yuanyunzhe/bi-trans-parser.
Open Datasets | Yes | We evaluate our framework on the Wall Street Journal corpus of Penn Treebank (PTB), Chinese Treebank 5 (CTB), and eight additional treebanks of different languages from Universal Dependencies 2.2 (UD). For PTB, we use Stanford Dependencies (Silveira et al. 2014) to convert the original treebank to the dependency version and use the standard 2-21/22/23 split for training, development and testing. For CTB, we use Penn2Malt to convert the original treebank to the dependency version with the heading rules of (Zhang and Clark 2008). We use pretrained word embeddings from GloVe (Pennington, Socher, and Manning 2014) in our experiments on PTB and pretrained word embeddings from fastText (Bojanowski et al. 2017) in our experiments on UD. We use word2vec (Mikolov et al. 2013) to train Chinese word embeddings on Wikipedia for our experiments on CTB.
Dataset Splits | Yes | For PTB, we use Stanford Dependencies (Silveira et al. 2014) to convert the original treebank to the dependency version and use the standard 2-21/22/23 split for training, development and testing. For CTB, we use Penn2Malt to convert the original treebank to the dependency version with the heading rules of (Zhang and Clark 2008). We use the training, development and testing split of the dataset following (Zhang and Clark 2008). See Table 2 for statistics of the datasets. (The PTB section split is restated as a small configuration sketch after the table.)
Hardware Specification | No | The paper does not specify any hardware details such as GPU/CPU models or the types of machines used for experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python version, deep learning framework version such as PyTorch/TensorFlow, or library versions).
Experiment Setup | Yes | Table 5 shows all the other hyperparameters used in our experiments: word embedding dimension 100, POS tag dimension 25, BiLSTM layers 2, LSTM dimensions 200/200, MLP units 100. For dual decomposition, we decrease the update rate αk over time as recommended by (Rush and Collins 2012). Specifically, we set αk = cdd / (t + 1), where t is the number of times the dual value increases and cdd is a hyperparameter. For our method based on dynamic oracle, we also decrease the reward cdo over time in a similar way to make the decoding process stabler. We then fix the unidirectional parsers and tune the hyperparameters of our joint decoding algorithms (cdd ∈ {0, 0.005, 0.01} and cdo ∈ {0, 1, 2, 3, 4}) based on the UAS on the development set.
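The Pseudocode row lists Algorithm 1 as greedy transition-based parsing. As a rough, hedged illustration of that idea (not the authors' exact algorithm), the following Python sketch runs a greedy arc-standard parser; score_transitions is a hypothetical placeholder for a learned transition scorer such as the BiLSTM-based model of Kiperwasser and Goldberg (2016).

    # Minimal sketch of greedy transition-based parsing (arc-standard system).
    # `score_transitions` is a hypothetical stand-in for a learned scorer; this
    # is an illustration, not the authors' implementation.
    def greedy_parse(words, score_transitions):
        """Return a list of (head, dependent) arcs over token indices of `words`."""
        buffer = list(range(len(words)))  # unread token indices, left to right
        stack = []                        # partially processed token indices
        arcs = []                         # collected (head, dependent) arcs

        while buffer or len(stack) > 1:
            legal = []
            if buffer:
                legal.append("SHIFT")
            if len(stack) >= 2:
                legal.extend(["LEFT-ARC", "RIGHT-ARC"])

            # Greedily pick the highest-scoring legal transition.
            scores = score_transitions(words, stack, buffer)
            action = max(legal, key=lambda a: scores[a])

            if action == "SHIFT":
                stack.append(buffer.pop(0))
            elif action == "LEFT-ARC":
                dependent = stack.pop(-2)            # second-from-top becomes dependent
                arcs.append((stack[-1], dependent))  # of the new stack top
            else:  # RIGHT-ARC
                dependent = stack.pop()              # top becomes dependent
                arcs.append((stack[-1], dependent))  # of the element below it
        return arcs

With a trained scorer plugged in, this greedy loop corresponds to a single unidirectional (left-to-right) parser; the paper's bidirectional framework additionally runs a right-to-left parser and combines the two with joint decoding, either by dual decomposition (Algorithm 2) or guided by a dynamic oracle (Algorithm 3).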
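The Dataset Splits row quotes the standard PTB section split (WSJ sections 2-21/22/23). A minimal configuration sketch of that split, with section numbers only and hypothetical key names:

    # Standard WSJ/PTB split quoted above: sections 02-21 train, 22 dev, 23 test.
    PTB_SPLIT = {
        "train": list(range(2, 22)),  # WSJ sections 02-21
        "dev": [22],
        "test": [23],
    }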
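The Experiment Setup row quotes the decaying update rate αk = cdd / (t + 1) for dual decomposition. The sketch below expresses that schedule; joint_decode and the step callable are hypothetical placeholders around one subgradient update of the joint decoder and are not the authors' implementation.

    # Decaying update rate alpha_k = c_dd / (t + 1), where t counts how many
    # times the dual value has increased (following Rush and Collins 2012).
    def update_rate(c_dd, t):
        return c_dd / (t + 1)

    def joint_decode(step, c_dd=0.01, max_iter=50):
        """`step` is a hypothetical callable performing one subgradient update of
        the joint decoder; it returns (dual_value, agreed)."""
        t = 0
        prev_dual = float("-inf")
        for _ in range(max_iter):
            dual_value, agreed = step(alpha=update_rate(c_dd, t))
            if agreed:                   # the two unidirectional parses agree
                return
            if dual_value > prev_dual:   # dual value increased -> shrink the rate
                t += 1
            prev_dual = dual_value

    # Example of the decay: c_dd = 0.01 gives rates 0.01, 0.005, 0.0033, ...

Per the quoted setup, the dynamic-oracle reward cdo is decreased over time in the same way to keep decoding stable.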