Jointly Parse and Fragment Ungrammatical Sentences
Authors: Homa B. Hashemi, Rebecca Hwa
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that the proposed strategies are promising for detecting incorrect syntactic dependencies as well as incorrect semantic dependencies. |
| Researcher Affiliation | Academia | Homa B. Hashemi, Rebecca Hwa; Intelligent Systems Program, Computer Science Department, University of Pittsburgh; hashemi@cs.pitt.edu, hwa@cs.pitt.edu |
| Pseudocode | No | The paper describes methods in text and provides examples, but does not include a formal pseudocode block or algorithm block. |
| Open Source Code | Yes | The code is available at: https://github.com/HHashemi/Dependency Arc Pruning |
| Open Datasets | Yes | We used three ESL corpora: First Certificate in English (FCE) dataset (Yannakoudakis, Briscoe, and Medlock 2011), National University of Singapore Corpus of Learner English (NUCLE) (Dahlmeier, Ng, and Wu 2013), and EF-Cambridge Open Language Database (EFCAMDAT) (Geertzen, Alexopoulou, and Korhonen 2013). |
| Dataset Splits | Yes | We then randomly separate 30,000 sentences as the development set, and the remaining 576,000 sentences as the training set. (A split sketch follows the table.) |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU models, CPU models, memory) used for running experiments. |
| Software Dependencies | No | The paper mentions software like "OpenNMT" and "SyntaxNet" with citations, but does not provide specific version numbers for these or other key libraries/tools. |
| Experiment Setup | Yes | In our implementation of seq2seq RNNs, we used 2-layer LSTMs with 750 hidden units in each layer both for decoding and encoding modules. We trained the network with a batch size of 48 and a maximum sequence length of 62 and 123 for the source and target sequences, respectively. The parameters of the model were uniformly initialized in [-0.1, 0.1], and the L2-normalized gradients were constrained to be at most 5 to prevent the exploding gradient effect. In the training phase, the learning rate schedule started at 1 and halved the learning rate after each epoch beyond epoch 10, or once the validation set perplexity no longer improved. We trained the network for up to 30 epochs, choosing the model with the lowest perplexity on the validation set as the final model. (A hedged configuration sketch follows the table.) |
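The Dataset Splits row above describes a simple random partition of the corpus into 30,000 development sentences and 576,000 training sentences. The sketch below is one way to reproduce such a split; it is a minimal illustration under assumptions, not the authors' actual pipeline, and the helper name, seed, and commented file handling are hypothetical.

```python
# Hedged sketch: randomly hold out 30,000 sentences for development and keep the
# rest for training, as quoted in the Dataset Splits row. Not the authors' code.
import random

def split_corpus(sentences, dev_size=30_000, seed=0):
    """Randomly partition a list of sentences into (train, dev)."""
    rng = random.Random(seed)
    shuffled = sentences[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    dev = shuffled[:dev_size]
    train = shuffled[dev_size:]      # ~576,000 sentences in the paper's setting
    return train, dev

# Hypothetical usage with a one-sentence-per-line file:
# with open("esl_sentences.txt") as f:
#     sentences = [line.strip() for line in f]
# train, dev = split_corpus(sentences)
```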
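The Experiment Setup row quotes concrete seq2seq hyperparameters: 2-layer LSTMs with 750 hidden units for both encoder and decoder, batch size 48, uniform initialization in [-0.1, 0.1], gradient-norm clipping at 5, and an SGD learning rate that starts at 1 and is halved after epoch 10 or when validation perplexity stops improving. The PyTorch sketch below renders those numbers as code. It is a sketch under stated assumptions, not the authors' implementation (the paper used OpenNMT); the vocabulary sizes, embedding dimension, and training-loop details are illustrative.

```python
# Hedged sketch of the quoted seq2seq configuration. Vocabulary sizes, embedding
# size, and the loss/loop details below are assumptions, not from the paper.
import torch
import torch.nn as nn

SRC_VOCAB, TGT_VOCAB, EMB_SIZE = 50_000, 50_000, 500   # assumed values
HIDDEN, LAYERS = 750, 2                                 # quoted: 2-layer LSTMs, 750 units
BATCH_SIZE, MAX_SRC_LEN, MAX_TGT_LEN = 48, 62, 123      # quoted batch size and max lengths

class Seq2Seq(nn.Module):
    def __init__(self):
        super().__init__()
        self.src_emb = nn.Embedding(SRC_VOCAB, EMB_SIZE)
        self.tgt_emb = nn.Embedding(TGT_VOCAB, EMB_SIZE)
        self.encoder = nn.LSTM(EMB_SIZE, HIDDEN, num_layers=LAYERS, batch_first=True)
        self.decoder = nn.LSTM(EMB_SIZE, HIDDEN, num_layers=LAYERS, batch_first=True)
        self.out = nn.Linear(HIDDEN, TGT_VOCAB)

    def forward(self, src, tgt):
        _, state = self.encoder(self.src_emb(src))            # encode the source sequence
        dec_out, _ = self.decoder(self.tgt_emb(tgt), state)   # condition the decoder on it
        return self.out(dec_out)                              # per-step vocabulary logits

model = Seq2Seq()
for p in model.parameters():                 # quoted: uniform initialization in [-0.1, 0.1]
    nn.init.uniform_(p, -0.1, 0.1)

optimizer = torch.optim.SGD(model.parameters(), lr=1.0)   # quoted: learning rate starts at 1
criterion = nn.CrossEntropyLoss()

def train_step(src_batch, tgt_batch):
    """One update on a (48, <=62) source batch and a (48, <=123) target batch."""
    optimizer.zero_grad()
    logits = model(src_batch, tgt_batch[:, :-1])              # teacher forcing
    loss = criterion(logits.reshape(-1, TGT_VOCAB), tgt_batch[:, 1:].reshape(-1))
    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)  # quoted: clip gradient norm at 5
    optimizer.step()
    return loss.item()

def maybe_decay_lr(epoch, val_ppl, best_ppl):
    """Halve the learning rate after epoch 10, or once validation perplexity stalls."""
    if epoch > 10 or val_ppl >= best_ppl:
        for group in optimizer.param_groups:
            group["lr"] *= 0.5
```

A full run would additionally enforce the maximum source/target lengths of 62 and 123 during preprocessing, mask padding in the loss, train for up to 30 epochs, and keep the checkpoint with the lowest validation perplexity, as described in the quoted setup.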