Attention-based Belief or Disbelief Feature Extraction for Dependency Parsing

Authors: Haoyuan Peng, Lu Liu, Yi Zhou, Junying Zhou, Xiaoqing Zheng

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on various datasets show that our arc-specific feature extraction mechanism significantly improves the performance of bi-directional LSTM-based models by explicitly modeling long-distance dependencies. For both English and Chinese, the proposed model achieves higher accuracy on the dependency parsing task than most existing neural attention-based models.
Researcher Affiliation | Academia | School of Computer Science, Fudan University, Shanghai, China; Shanghai Key Laboratory of Intelligent Information Processing
Pseudocode | No | The paper describes the model and algorithms using prose and mathematical formulas but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code, such as a repository link or an explicit statement of code release, for the methodology described.
Open Datasets | Yes | We evaluated the performance of our parser on English and Chinese with the Penn English Treebank-3 (PTB) and Chinese Treebank as the datasets. For English, we adopted Stanford basic dependencies. Following the standard split of PTB, we used sections 2-21 for training, section 22 for development and 23 for testing. The POS tags were assigned by the Stanford tagger (Toutanova et al. 2003) with a tagging accuracy of 97.3%. For Chinese, we used the same setup as (Zhang and Clark 2008).
Dataset Splits | Yes | For English, we adopted Stanford basic dependencies. Following the standard split of PTB, we used sections 2-21 for training, section 22 for development and 23 for testing. ... For Chinese, we used the same setup as (Zhang and Clark 2008). In more detail, we used sections 001-815 and 1001-1136 for training, sections 886-931 and 1148-1151 for development and sections 816-885 and 1137-1147 for testing.
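
For concreteness, the quoted splits can be written down as a small configuration. The sketch below is a hypothetical helper, not the authors' code (none was released): the section ranges are copied from the text above, while the names `PTB_SPLIT`, `CTB_SPLIT`, and `split_of` are illustrative.

```python
# Hypothetical encoding of the quoted train/dev/test splits.
# Section ranges come from the paper's text; all names are illustrative.

PTB_SPLIT = {                      # Penn English Treebank-3, WSJ sections
    "train": set(range(2, 22)),    # sections 02-21
    "dev":   {22},                 # section 22
    "test":  {23},                 # section 23
}

CTB_SPLIT = {                      # Chinese Treebank, setup of Zhang and Clark (2008)
    "train": set(range(1, 816)) | set(range(1001, 1137)),    # 001-815, 1001-1136
    "dev":   set(range(886, 932)) | set(range(1148, 1152)),  # 886-931, 1148-1151
    "test":  set(range(816, 886)) | set(range(1137, 1148)),  # 816-885, 1137-1147
}

def split_of(section_id: int, scheme: dict) -> str:
    """Return which split (train/dev/test) a treebank section belongs to."""
    for name, sections in scheme.items():
        if section_id in sections:
            return name
    raise ValueError(f"section {section_id} is outside the standard split")

if __name__ == "__main__":
    assert split_of(21, PTB_SPLIT) == "train"
    assert split_of(22, PTB_SPLIT) == "dev"
    assert split_of(1150, CTB_SPLIT) == "dev"
```
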
Hardware Specification | No | We trained our model on an Nvidia GPU card. This statement is too general and does not specify a particular GPU model or other hardware details for reproducibility.
Software Dependencies | No | The paper mentions using the Adam optimizer and Dropout, but does not provide specific version numbers for any software libraries, frameworks, or tools used in the implementation or experimentation.
Experiment Setup | Yes | Hyperparameters were tuned on the development set of PTB, and their final configuration is listed in Table 1. ... We used Adam (Kingma and Ba 2014) to optimize our model parameters with hyper-parameters recommended by the authors (i.e. learning rate = 0.001, β1 = 0.9 and β2 = 0.999). Dropout (Srivastava et al. 2014) is applied to our model by dropping nodes in LSTM layers with probability 0.5.
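
The quoted optimizer and dropout settings translate directly into a few lines of configuration. The sketch below assumes a PyTorch implementation (the paper does not name a framework); the BiLSTM dimensions are placeholder values, and only the Adam hyper-parameters and the 0.5 dropout on LSTM layers are taken from the text above.

```python
import torch
import torch.nn as nn

class BiLSTMEncoder(nn.Module):
    """Minimal bi-directional LSTM encoder used only to illustrate the
    quoted training configuration; sizes are placeholders, not the paper's."""

    def __init__(self, input_dim: int = 100, hidden_dim: int = 200, num_layers: int = 2):
        super().__init__()
        # dropout=0.5 between stacked LSTM layers, mirroring the statement
        # about dropping nodes in LSTM layers with probability 0.5.
        self.lstm = nn.LSTM(
            input_dim, hidden_dim, num_layers=num_layers,
            bidirectional=True, dropout=0.5, batch_first=True,
        )

    def forward(self, x):
        out, _ = self.lstm(x)
        return out

encoder = BiLSTMEncoder()
# Adam with the hyper-parameters quoted from the paper:
# learning rate 0.001, beta1 = 0.9, beta2 = 0.999.
optimizer = torch.optim.Adam(encoder.parameters(), lr=0.001, betas=(0.9, 0.999))
```
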