Discontinuous Constituent Parsing with Pointer Networks
Authors: Daniel Fernández-González, Carlos Gómez-Rodríguez
AAAI 2020, pp. 7724-7731
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on both widely-known NEGRA (Skut et al. 1997) and TIGER (Brants et al. 2002) treebanks, which contain a large number of German sentences syntactically annotated by constituent trees with a high degree of discontinuity: the prevalence of discontinuous constituent structures is over 25% in both treebanks (Maier and Lichte 2011). In both benchmarks, our approach achieves accuracies beyond 85% F-score (even without Part-of-Speech (POS) tagging information), surpassing the current state of the art by a wide margin without the need of orthogonal techniques such as re-ranking or semi-supervision. |
| Researcher Affiliation | Academia | Daniel Fernández-González, Carlos Gómez-Rodríguez, Universidade da Coruña, CITIC, FASTPARSE Lab, LyS Group, Depto. de Ciencias de la Computación y Tecnologías de la Información, Elviña, 15071 A Coruña, Spain. {d.fgonzalez, carlos.gomez}@udc.es |
| Pseudocode | No | The paper describes the neural network architecture and its components but does not provide any pseudocode or algorithm blocks. |
| Open Source Code | Yes | The resulting neural model produces the most accurate discontinuous constituent representations reported so far. Available at https://github.com/danifg/DiscoPointer |
| Open Datasets | Yes | We test our new approach on two widely-used discontinuous German treebanks: NEGRA (Skut et al. 1997) and TIGER (Brants et al. 2002). |
| Dataset Splits | Yes | For the latter [TIGER], we use the split provided in the SPMRL 2014 shared task (Crabbé 2014), and, for NEGRA, we follow the standard splits (Dubey and Keller 2003). ... In addition, during training, the model with the highest Labelled Attachment Score (LAS) on the augmented dependency version of the development set is chosen. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments. It only mentions the architecture and training settings. |
| Software Dependencies | No | The paper mentions software components like "Adam optimizer" and types of neural networks (e.g., "BiLSTM-CNN architecture", "Pointer Network", "biaffine classifier") but does not provide specific version numbers for any libraries, frameworks, or software dependencies required for replication. A sketch of how these named components typically fit together is given after this table. |
| Experiment Setup | Yes | Table 1: Model hyper-parameters. CNN window size 3; CNN number of filters 50; BiLSTM encoder layers 3; BiLSTM encoder size 512; LSTM decoder layers 1; LSTM decoder size 512; POS tag/word/character embedding dimension 100; LSTM layers/embeddings dropout 0.33; MLP layers 1; MLP activation function ELU; Arc MLP size 512; Label MLP size 128. Adam optimizer hyper-parameters: initial learning rate 0.001; β1, β2 = 0.9; decay rate 0.75; gradient clipping 5.0. An optimizer wiring sketch also follows the table. |
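
The Software Dependencies row above names the model's main building blocks (BiLSTM-CNN encoder, pointer network, biaffine classifier) without showing code. Below is a minimal PyTorch sketch of how a biaffine pointer and a biaffine label classifier typically fit together in this family of parsers (Dozat-and-Manning-style biaffine scoring). The MLP sizes (arc 512, label 128) and the ELU activation come from the paper's Table 1; the class name `BiaffinePointer`, the tensor shapes, the default `enc_size`/`n_labels`, and all other details are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class BiaffinePointer(nn.Module):
    """Illustrative sketch, NOT the authors' released code.

    Two pieces the paper names: (1) a pointer that scores every encoder
    position for the current decoder state (its argmax is the predicted
    attachment), and (2) a biaffine classifier over arc labels.  MLP sizes
    and the ELU activation follow Table 1; everything else is assumed.
    """

    def __init__(self, enc_size=1024, dec_size=512,
                 arc_mlp=512, label_mlp=128, n_labels=30):
        super().__init__()
        # One-layer MLPs with ELU activations, as reported in Table 1.
        self.arc_enc = nn.Sequential(nn.Linear(enc_size, arc_mlp), nn.ELU())
        self.arc_dec = nn.Sequential(nn.Linear(dec_size, arc_mlp), nn.ELU())
        self.lab_enc = nn.Sequential(nn.Linear(enc_size, label_mlp), nn.ELU())
        self.lab_dec = nn.Sequential(nn.Linear(dec_size, label_mlp), nn.ELU())
        # Biaffine weight tensors; the +1 rows/columns carry bias terms.
        self.W_arc = nn.Parameter(torch.empty(arc_mlp + 1, arc_mlp))
        self.W_lab = nn.Parameter(torch.empty(n_labels, label_mlp + 1, label_mlp + 1))
        nn.init.xavier_uniform_(self.W_arc)
        nn.init.xavier_uniform_(self.W_lab)

    @staticmethod
    def _with_bias(x):
        # Append a constant 1 feature so the biaffine product includes biases.
        return torch.cat([x, x.new_ones(*x.shape[:-1], 1)], dim=-1)

    def point(self, enc, dec_t):
        """enc: (batch, seq, enc_size); dec_t: (batch, dec_size).
        Returns one score per encoder position: (batch, seq)."""
        h = self.arc_enc(enc)                     # (b, s, arc_mlp)
        d = self._with_bias(self.arc_dec(dec_t))  # (b, arc_mlp + 1)
        return torch.einsum("bi,ij,bsj->bs", d, self.W_arc, h)

    def label(self, enc_i, dec_t):
        """enc_i: (batch, enc_size) encoder state selected by the pointer.
        Returns label scores: (batch, n_labels)."""
        h = self._with_bias(self.lab_enc(enc_i))  # (b, label_mlp + 1)
        d = self._with_bias(self.lab_dec(dec_t))  # (b, label_mlp + 1)
        return torch.einsum("bi,oij,bj->bo", d, self.W_lab, h)
```

A decoding step would then call `point` to pick an attachment position and `label` to tag it, e.g.:

```python
scorer = BiaffinePointer()
enc = torch.randn(2, 10, 1024)   # BiLSTM states: 512 per direction (assumed)
dec_t = torch.randn(2, 512)      # one decoder time step
attach = scorer.point(enc, dec_t).argmax(-1)                        # pointed-to positions
labels = scorer.label(enc[torch.arange(2), attach], dec_t).argmax(-1)
```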
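Similarly, as a reading aid for the Experiment Setup row, here is one way the Table 1 training values might be wired up in PyTorch. The paper does not name its framework or specify the form of the 0.75 decay rate, so the `ExponentialLR` schedule and the stand-in encoder below are assumptions; only the numeric values are quoted from the paper.

```python
import torch
import torch.nn as nn

# Values quoted from Table 1 of the paper; the keys are our own naming.
HPARAMS = dict(
    cnn_window=3, cnn_filters=50,
    enc_layers=3, enc_size=512,
    dec_layers=1, dec_size=512,
    emb_dim=100, dropout=0.33,
    arc_mlp=512, label_mlp=128,
    lr=1e-3, betas=(0.9, 0.9),   # beta1 = beta2 = 0.9, as reported
    decay=0.75, grad_clip=5.0,
)

# Stand-in for the paper's BiLSTM-CNN encoder (placeholder, not the real model).
encoder = nn.LSTM(
    input_size=HPARAMS["emb_dim"],
    hidden_size=HPARAMS["enc_size"],
    num_layers=HPARAMS["enc_layers"],
    dropout=HPARAMS["dropout"],
    bidirectional=True,
    batch_first=True,
)

optimizer = torch.optim.Adam(encoder.parameters(),
                             lr=HPARAMS["lr"], betas=HPARAMS["betas"])
# The schedule behind the reported 0.75 "decay rate" is unspecified; an
# exponential decay is one plausible reading (assumption).
scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=HPARAMS["decay"])

# One dummy step showing where the reported gradient clipping (5.0) applies.
x = torch.randn(2, 10, HPARAMS["emb_dim"])  # (batch, seq_len, emb_dim)
out, _ = encoder(x)
out.sum().backward()                         # dummy loss for illustration
torch.nn.utils.clip_grad_norm_(encoder.parameters(), HPARAMS["grad_clip"])
optimizer.step()
scheduler.step()
```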