Discontinuous Constituent Parsing with Pointer Networks

Authors: Daniel Fernández-González, Carlos Gómez-Rodríguez

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on both the widely-known NEGRA (Skut et al. 1997) and TIGER (Brants et al. 2002) treebanks, which contain a large number of German sentences syntactically annotated with constituent trees exhibiting a high degree of discontinuity: the prevalence of discontinuous constituent structures is over 25% in both treebanks (Maier and Lichte 2011). On both benchmarks, our approach achieves accuracies beyond an 85% F-score (even without Part-of-Speech (POS) tagging information), surpassing the current state of the art by a wide margin without the need for orthogonal techniques such as re-ranking or semi-supervision.
Researcher Affiliation | Academia | Daniel Fernández-González, Carlos Gómez-Rodríguez, Universidade da Coruña, CITIC, FASTPARSE Lab, LyS Group, Depto. de Ciencias de la Computación y Tecnologías de la Información, Elviña, 15071 A Coruña, Spain. {d.fgonzalez, carlos.gomez}@udc.es
Pseudocode | No | The paper describes the neural network architecture and its components but does not provide any pseudocode or algorithm blocks (a hedged sketch of the pointer-based decoding step is given after this table).
Open Source Code | Yes | The resulting neural model produces the most accurate discontinuous constituent representations reported so far. Available at https://github.com/danifg/DiscoPointer
Open Datasets | Yes | We test our new approach on two widely-used discontinuous German treebanks: NEGRA (Skut et al. 1997) and TIGER (Brants et al. 2002).
Dataset Splits | Yes | For the latter [TIGER], we use the split provided in the SPMRL shared task (Crabbé 2014), and, for NEGRA, we follow the standard splits (Dubey and Keller 2003). ... In addition, during training, the model with the highest Labelled Attachment Score (LAS) on the augmented dependency version of the development set is chosen (see the checkpoint-selection sketch after this table).
Hardware Specification | No | The paper does not provide any specific hardware details used for running the experiments; it only mentions the architecture and training settings.
Software Dependencies | No | The paper mentions software components such as the Adam optimizer and types of neural networks (e.g., the BiLSTM-CNN architecture, a Pointer Network, a biaffine classifier) but does not provide specific version numbers for any libraries, frameworks, or software dependencies required for replication (a hedged biaffine-scorer sketch follows this table).
Experiment Setup | Yes | Table 1: Model hyper-parameters. CNN window size: 3; CNN number of filters: 50; BiLSTM encoder layers: 3; BiLSTM encoder size: 512; LSTM decoder layers: 1; LSTM decoder size: 512; POS tag/word/character embedding dimension: 100; LSTM layers/embeddings dropout: 0.33; MLP layers: 1; MLP activation function: ELU; Arc MLP size: 512; Label MLP size: 128. Adam optimizer hyper-parameters: initial learning rate: 0.001; β1 = β2 = 0.9; decay rate: 0.75; gradient clipping: 5.0. (These settings are collected into a runnable configuration sketch after this table.)
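
Since the paper itself provides no pseudocode, the following is a minimal sketch of the kind of pointer-network decoding it describes: one encoder state per word, a one-layer LSTM decoder, and additive attention whose argmax "points" to a head position. Everything here (names, shapes, the greedy loop, the PyTorch framing) is our assumption for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PointerDecoder(nn.Module):
    """Greedy pointer-network decoding over BiLSTM encoder states."""

    def __init__(self, enc_size=1024, dec_size=512):
        super().__init__()
        self.decoder = nn.LSTMCell(enc_size, dec_size)   # 1-layer decoder
        # Additive attention acting as the pointer.
        self.w_enc = nn.Linear(enc_size, dec_size, bias=False)
        self.w_dec = nn.Linear(dec_size, dec_size, bias=False)
        self.v = nn.Linear(dec_size, 1, bias=False)

    def forward(self, enc_states):
        # enc_states: (seq_len, enc_size); position 0 stands for the root.
        h = enc_states.new_zeros(1, self.decoder.hidden_size)
        c = enc_states.new_zeros(1, self.decoder.hidden_size)
        heads = []
        for i in range(1, enc_states.size(0)):           # one step per word
            h, c = self.decoder(enc_states[i:i + 1], (h, c))
            scores = self.v(torch.tanh(self.w_enc(enc_states) + self.w_dec(h)))
            heads.append(int(scores.squeeze(-1).argmax()))  # pointed head
        return heads

decoder = PointerDecoder()
encoded = torch.randn(6, 1024)   # root + 5 words from a BiLSTM-CNN encoder
print(decoder(encoded))          # one predicted head position per word
```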
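
The checkpoint-selection rule quoted in the Dataset Splits row (keep the model with the highest dev-set LAS) can be made concrete with a small runnable stand-in; `dev_las` and the `nn.Linear` model below are placeholders, not the authors' code.

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 8)   # stand-in for the full parser

def dev_las(parser: nn.Module) -> float:
    """Placeholder: real code would parse the dev set and compute LAS."""
    return float(torch.rand(()))

best_las = -1.0
for epoch in range(5):
    # ... one training epoch over the treebank would run here ...
    las = dev_las(model)
    if las > best_las:                        # keep the best dev-LAS model
        best_las = las
        torch.save(model.state_dict(), "best_model.pt")
print(f"selected checkpoint with dev LAS {best_las:.3f}")
```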
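
The paper names a biaffine classifier among its components without pinning down versions or code; below is a hedged sketch of a deep-biaffine label scorer in the style of Dozat and Manning (2017), using the 128-dimensional label MLP from Table 1. The exact wiring and the remaining dimensions are assumptions.

```python
import torch
import torch.nn as nn

class BiaffineLabeler(nn.Module):
    """Deep-biaffine label scorer for a (head, dependent) pair."""

    def __init__(self, enc_size=1024, mlp_size=128, n_labels=40):
        super().__init__()
        self.head_mlp = nn.Sequential(nn.Linear(enc_size, mlp_size), nn.ELU())
        self.dep_mlp = nn.Sequential(nn.Linear(enc_size, mlp_size), nn.ELU())
        # One bilinear form per label; the +1 appends a bias feature.
        self.U = nn.Parameter(torch.empty(n_labels, mlp_size + 1, mlp_size + 1))
        nn.init.xavier_uniform_(self.U)

    def forward(self, head_vec, dep_vec):
        h = torch.cat([self.head_mlp(head_vec), head_vec.new_ones(1)])
        d = torch.cat([self.dep_mlp(dep_vec), dep_vec.new_ones(1)])
        # score[l] = h^T U[l] d for every label l
        return torch.einsum('i,lij,j->l', h, self.U, d)

labeler = BiaffineLabeler()
scores = labeler(torch.randn(1024), torch.randn(1024))
print(int(scores.argmax()))   # index of the highest-scoring label
```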
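
For convenience, the Table 1 settings can be collected into a single configuration and handed to the optimizer. The dict keys are our naming, and how the 0.75 decay rate is scheduled (here: exponentially, once per epoch) is an assumption; only the values themselves come from the paper.

```python
import torch

# Table 1, as a flat configuration (key names are ours).
HPARAMS = {
    "cnn_window": 3, "cnn_filters": 50,
    "encoder_layers": 3, "encoder_size": 512,
    "decoder_layers": 1, "decoder_size": 512,
    "emb_dim": 100,            # POS tag / word / character embeddings
    "dropout": 0.33,           # LSTM layers / embeddings
    "mlp_layers": 1, "arc_mlp_size": 512, "label_mlp_size": 128,
    "lr": 0.001, "betas": (0.9, 0.9), "decay_rate": 0.75, "clip": 5.0,
}

model = torch.nn.Linear(8, 8)  # stand-in for the full parser
optimizer = torch.optim.Adam(model.parameters(), lr=HPARAMS["lr"],
                             betas=HPARAMS["betas"])
# Assumed schedule: multiply the learning rate by 0.75 once per epoch.
scheduler = torch.optim.lr_scheduler.ExponentialLR(
    optimizer, gamma=HPARAMS["decay_rate"])
# Before each optimizer step, clip gradients at the Table 1 threshold.
torch.nn.utils.clip_grad_norm_(model.parameters(), HPARAMS["clip"])
```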