Parsing as Pretraining
Authors: David Vilares, Michalina Strzyz, Anders Søgaard, Carlos Gómez-Rodríguez
AAAI 2020, pp. 9114-9121 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | For evaluation, we use bracketing F1 score and LAS, and analyze in-depth differences across representations for span lengths and dependency displacements. The overall results surpass existing sequence tagging parsers on the PTB (93.5%) and end-to-end EN-EWT UD (78.8%). (An illustrative LAS sketch follows the table.) |
| Researcher Affiliation | Collaboration | Universidade da Coruña, CITIC, Ciencias de la Computación y Tecnologías de la Información (CC&TI), A Coruña, Spain; University of Copenhagen, Department of Computer Science, Copenhagen, Denmark; Google Research, Berlin, Germany |
| Pseudocode | No | The paper describes the approach, including the mapping and postprocessing steps, and depicts the architecture in Figure 3, but does not present any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | The source code is accessible at https://github.com/aghie/parsing-as-pretraining. |
| Open Datasets | Yes | We use the English Penn Treebank (PTB) (Marcus, Santorini, and Marcinkiewicz 1993) for evaluation on constituent parsing, and the EN-EWT UD treebank (v2.2) for dependency parsing (Nivre et al. 2017). |
| Dataset Splits | No | The paper mentions using 'train' and 'test' sets but does not explicitly provide details about a 'validation' dataset split or its proportions. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as CPU or GPU models. |
| Software Dependencies | No | The paper mentions using a 'pytorch wrapper' for BERT and building on the 'framework by Yang and Zhang (2018)' but does not provide specific version numbers for software dependencies such as PyTorch, Python, or CUDA. |
| Experiment Setup | Yes | For ff/lstm, the learning rate was set to 5e-4. |
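
As a point of reference for the LAS figure quoted in the Research Type row, below is a minimal sketch of how a labeled attachment score can be computed. It is not drawn from the authors' repository; the `las` function and the toy sentence are purely illustrative.

```python
# Minimal sketch (not from the paper's codebase): labeled attachment score (LAS),
# i.e., the fraction of tokens whose predicted head index and dependency label
# both match the gold tree.

def las(gold, predicted):
    """gold/predicted: lists of (head_index, dependency_label) pairs, one per token."""
    assert len(gold) == len(predicted)
    correct = sum(
        1
        for (g_head, g_label), (p_head, p_label) in zip(gold, predicted)
        if g_head == p_head and g_label == p_label
    )
    return correct / len(gold) if gold else 0.0


if __name__ == "__main__":
    # Toy 3-token sentence: (head index, relation label) per token; 0 marks the root.
    gold = [(2, "nsubj"), (0, "root"), (2, "obj")]
    pred = [(2, "nsubj"), (0, "root"), (1, "obj")]  # wrong head on the last token
    print(f"LAS = {las(gold, pred):.2f}")  # LAS = 0.67
```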