Implicit Argument Prediction as Reading Comprehension
Authors: Pengxiang Cheng, Katrin Erk
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our model shows good performance on an argument cloze task as well as on a nominal implicit argument prediction task. and 5 Empirical Results, 5.1 Training Data Preprocessing: We construct a large-scale training dataset from the full English Wikipedia corpus. and 5.2 Evaluation on OntoNotes Dataset and 5.3 Evaluation on G&C Dataset |
| Researcher Affiliation | Academia | Pengxiang Cheng Department of Computer Science The University of Texas at Austin pxcheng@utexas.edu Katrin Erk Department of Linguistics The University of Texas at Austin katrin.erk@mail.utexas.edu |
| Pseudocode | No | The paper describes the model architecture with diagrams and mathematical equations but does not present any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/pxch/imp_arg_rc. |
| Open Datasets | Yes | We construct a large-scale training dataset from the full English Wikipedia corpus. and Our main evaluation is on the argument cloze task using the OntoNotes datasets of Cheng and Erk (2018). and The implicit argument dataset by Gerber and Chai (2010; 2012) is a very small dataset with 966 annotated implicit arguments... |
| Dataset Splits | Yes | for each testing fold, the model is tuned on the other nine folds. |
| Hardware Specification | No | The paper mentions using 'grid resources' from the 'Texas Advanced Computing Center' and the 'Chameleon testbed' but does not provide specific hardware details such as GPU or CPU models. |
| Software Dependencies | No | The paper mentions software components like 'Stanford CoreNLP', 'word2vec embeddings', 'Bi-GRU', and 'Adagrad optimizer' but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | We use a hidden size of 300 in both document encoder and query encoder, and apply a dropout layer with a rate of 0.2 on all embeddings before they are passed to the encoders. We train the model for 10 epochs with a batch size of 128, using Adagrad optimizer (Duchi, Hazan, and Singer 2011) to minimize the negative log-likelihood loss as defined in Equation 2 with a learning rate of 0.01. The hyperparameters are: (B = 4, λ = 1.0), (B = 8, λ = 1.0), (B = 16, λ = 1.0), (B = 8, λ = 0.1), and (B = 8, λ = 0.0), where B is the batch size and λ is the ℓ2 regularizer weight. |
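
The Experiment Setup row above specifies the reported training configuration (hidden size 300 in both encoders, dropout 0.2 on embeddings, 10 epochs, batch size 128, Adagrad with learning rate 0.01, negative log-likelihood loss). The sketch below shows one way that configuration could look in PyTorch; the `PointerQueryModel` class, its attention step, and the vocabulary size are illustrative assumptions rather than the authors' released code (see the repository linked above for the actual implementation).

```python
import torch
import torch.nn as nn

class PointerQueryModel(nn.Module):
    """Hypothetical stand-in for the paper's reading-comprehension model:
    Bi-GRU document and query encoders with attention over document tokens."""
    def __init__(self, vocab_size, emb_dim=300, hidden_size=300, dropout=0.2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.emb_dropout = nn.Dropout(dropout)  # dropout rate 0.2 on embeddings
        self.doc_encoder = nn.GRU(emb_dim, hidden_size,
                                  bidirectional=True, batch_first=True)
        self.query_encoder = nn.GRU(emb_dim, hidden_size,
                                    bidirectional=True, batch_first=True)

    def forward(self, doc_ids, query_ids):
        doc = self.emb_dropout(self.embed(doc_ids))      # (B, doc_len, emb_dim)
        query = self.emb_dropout(self.embed(query_ids))  # (B, qry_len, emb_dim)
        doc_h, _ = self.doc_encoder(doc)                 # (B, doc_len, 2*hidden)
        _, q_final = self.query_encoder(query)           # (2, B, hidden)
        q = torch.cat([q_final[0], q_final[1]], dim=-1)  # (B, 2*hidden)
        # Attention scores over document positions act as candidate logits.
        scores = torch.bmm(doc_h, q.unsqueeze(-1)).squeeze(-1)  # (B, doc_len)
        return torch.log_softmax(scores, dim=-1)

model = PointerQueryModel(vocab_size=50000)  # vocabulary size is a placeholder
optimizer = torch.optim.Adagrad(model.parameters(), lr=0.01)
loss_fn = nn.NLLLoss()  # negative log-likelihood over candidate positions

# Training loop skeleton: 10 epochs, batch size 128 (data loading not shown).
# for epoch in range(10):
#     for doc_ids, query_ids, target_idx in train_loader:
#         optimizer.zero_grad()
#         loss = loss_fn(model(doc_ids, query_ids), target_idx)
#         loss.backward()
#         optimizer.step()
```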