Attention-based Belief or Disbelief Feature Extraction for Dependency Parsing
Authors: Haoyuan Peng, Lu Liu, Yi Zhou, Junying Zhou, Xiaoqing Zheng
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various datasets show that our arc-specific feature extraction mechanism significantly improves the performance of bi-directional LSTM-based models by explicitly modeling long-distance dependencies. For both English and Chinese, the proposed model achieves higher accuracy on the dependency parsing task than most existing neural attention-based models. |
| Researcher Affiliation | Academia | School of Computer Science, Fudan University, Shanghai, China; Shanghai Key Laboratory of Intelligent Information Processing |
| Pseudocode | No | The paper describes the model and algorithms using prose and mathematical formulas but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code, such as a repository link or an explicit statement of code release, for the methodology described. |
| Open Datasets | Yes | We evaluated the performance of our parser on English and Chinese with the Penn English Treebank-3 (PTB) and Chinese Treebank as the datasets. For English, we adopted Stanford basic dependencies. Following the standard split of PTB, we used sections 2-21 for training, section 22 for development and 23 for testing. The POS tags were assigned by the Stanford tagger (Toutanova et al. 2003) with a tagging accuracy of 97.3%. For Chinese, we used the same setup as (Zhang and Clark 2008). |
| Dataset Splits | Yes | For English, we adopted Stanford basic dependencies. Following the standard split of PTB, we used sections 2-21 for training, section 22 for development and section 23 for testing. ... For Chinese, we used the same setup as (Zhang and Clark 2008). In more detail, we used sections 001-815 and 1001-1136 for training, sections 886-931 and 1148-1151 for development and sections 816-885 and 1137-1147 for testing. (A split sketch in code follows the table.) |
| Hardware Specification | No | We trained our model on an Nvidia GPU card. This statement is too general and does not specify a particular GPU model or other hardware details for reproducibility. |
| Software Dependencies | No | The paper mentions using Adam optimizer and Dropout, but does not provide specific version numbers for any software libraries, frameworks, or tools used in the implementation or experimentation. |
| Experiment Setup | Yes | Hyperparameters were tuned on the development set of PTB, and their final configuration is listed in table 1. ... We used Adam (Kingma and Ba 2014) to optimize our model parameters with hyper-parameters recommended by the authors (i.e. learning rate = 0.001, β1 = 0.9 and β2 = 0.999). Dropout (Srivastava et al. 2014) is applied to our model by dropping nodes in LSTM layers with probability 0.5. |
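
The PTB and CTB section splits quoted in the Dataset Splits row can be captured in a few lines. This is a minimal sketch, not the authors' code: the data layout, the helper name `split_of`, and the assumption that each treebank file maps to an integer section ID are hypothetical; only the section ranges themselves come from the paper.

```python
# Hypothetical reconstruction of the train/dev/test splits quoted above.
# Section ranges are from the paper; everything else is an assumption.

PTB_SPLIT = {
    "train": range(2, 22),   # WSJ sections 02-21
    "dev":   [22],           # section 22
    "test":  [23],           # section 23
}

CTB_SPLIT = {
    "train": list(range(1, 816)) + list(range(1001, 1137)),   # 001-815, 1001-1136
    "dev":   list(range(886, 932)) + list(range(1148, 1152)), # 886-931, 1148-1151
    "test":  list(range(816, 886)) + list(range(1137, 1148)), # 816-885, 1137-1147
}

def split_of(section: int, split_table: dict) -> str:
    """Return which split a treebank section belongs to, or 'unused'."""
    for name, sections in split_table.items():
        if section in sections:
            return name
    return "unused"

assert split_of(5, PTB_SPLIT) == "train"
assert split_of(1150, CTB_SPLIT) == "dev"
```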
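
Likewise, the Experiment Setup row fixes the optimizer and dropout hyperparameters but names no framework. The sketch below expresses those settings in PyTorch purely as an assumption; the BiLSTM encoder and its dimensions are placeholders, and only the learning rate, β1, β2, and the 0.5 dropout probability are taken from the paper.

```python
# A minimal PyTorch sketch of the optimizer and dropout settings quoted above.
# The paper does not specify a framework or model sizes; the encoder below is
# a placeholder, and only the hyperparameter values are from the paper.
import torch
import torch.nn as nn

encoder = nn.LSTM(
    input_size=100,      # placeholder embedding size (not given in this row)
    hidden_size=200,     # placeholder hidden size (not given in this row)
    num_layers=2,
    bidirectional=True,
    dropout=0.5,         # "dropping nodes in LSTM layers with probability 0.5"
    batch_first=True,
)

optimizer = torch.optim.Adam(
    encoder.parameters(),
    lr=0.001,            # learning rate recommended by Kingma and Ba (2014)
    betas=(0.9, 0.999),  # β1 = 0.9, β2 = 0.999 as stated in the paper
)
```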