Implicit Argument Prediction as Reading Comprehension

Authors: Pengxiang Cheng, Katrin Erk (pp. 6284-6291)

AAAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our model shows good performance on an argument cloze task as well as on a nominal implicit argument prediction task."; Section 5 (Empirical Results), 5.1 (Training Data Preprocessing: "We construct a large scale training dataset from the full English Wikipedia corpus."), 5.2 (Evaluation on OntoNotes Dataset), and 5.3 (Evaluation on G&C Dataset)
Researcher Affiliation | Academia | Pengxiang Cheng, Department of Computer Science, The University of Texas at Austin (pxcheng@utexas.edu); Katrin Erk, Department of Linguistics, The University of Texas at Austin (katrin.erk@mail.utexas.edu)
Pseudocode | No | The paper describes the model architecture with diagrams and mathematical equations but does not present any pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at https://github.com/pxch/imp_arg_rc."
Open Datasets | Yes | "We construct a large scale training dataset from the full English Wikipedia corpus."; "Our main evaluation is on the argument cloze task using the OntoNotes datasets of Cheng and Erk (2018)."; "The implicit argument dataset by Gerber and Chai (2010; 2012) is a very small dataset with 966 annotated implicit arguments..."
Dataset Splits | Yes | "For each testing fold, the model is tuned on the other nine folds." (See the cross-validation sketch after the table.)
Hardware Specification | No | The paper mentions using grid resources from the Texas Advanced Computing Center and the Chameleon testbed, but does not provide specific hardware details such as GPU or CPU models.
Software Dependencies | No | The paper mentions software components such as Stanford CoreNLP, word2vec embeddings, a Bi-GRU encoder, and the Adagrad optimizer, but does not provide version numbers for any of them.
Experiment Setup | Yes | "We use a hidden size of 300 in both document encoder and query encoder, and apply a dropout layer with a rate of 0.2 on all embeddings before they are passed to the encoders. We train the model for 10 epochs with a batch size of 128, using Adagrad optimizer (Duchi, Hazan, and Singer 2011) to minimize the negative log-likelihood loss as defined in Equation 2 with a learning rate of 0.01. The hyperparameters are: (B = 4, λ = 1.0), (B = 8, λ = 1.0), (B = 16, λ = 1.0), (B = 8, λ = 0.1), and (B = 8, λ = 0.0), where B is the batch size and λ is the ℓ2 regularizer weight." (See the training-configuration sketch after the table.)
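
The dataset-split protocol quoted in the "Dataset Splits" row amounts to ten-fold cross-validation with per-fold tuning. Below is a minimal sketch, assuming the folds are already materialized as lists of examples; the tune and evaluate callables are hypothetical placeholders, not functions from the released code.

```python
# Minimal sketch (not the authors' code) of the quoted 10-fold protocol:
# each fold is held out for testing while the model is tuned on the other nine.
from typing import Callable, Sequence


def ten_fold_eval(
    folds: Sequence[Sequence],   # 10 disjoint subsets of the evaluation data
    tune: Callable,              # hypothetical: returns a model tuned on 9 folds
    evaluate: Callable,          # hypothetical: scores a model on the held-out fold
) -> float:
    scores = []
    for i, test_fold in enumerate(folds):
        train_folds = [f for j, f in enumerate(folds) if j != i]
        model = tune(train_folds)                  # hyperparameter selection on 9 folds
        scores.append(evaluate(model, test_fold))  # score on the held-out fold
    return sum(scores) / len(scores)               # average over the 10 folds
```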
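
The experiment-setup quote maps onto a standard training configuration. The sketch below is a hedged illustration, assuming a PyTorch implementation: the model, data loader, and batch fields are hypothetical stand-ins, and only the quoted values (hidden size 300, dropout rate 0.2, 10 epochs, batch size 128, Adagrad with learning rate 0.01, negative log-likelihood loss, ℓ2 weight λ) come from the paper. Here λ is mapped onto Adagrad's weight_decay argument.

```python
# A hedged sketch of the quoted training configuration, assuming PyTorch.
# The model and data loader are hypothetical; only the constant values below
# come from the paper's experiment-setup description.
import torch
import torch.nn as nn

HIDDEN_SIZE = 300     # hidden size of both document and query encoders
DROPOUT_RATE = 0.2    # dropout applied to all embeddings before encoding
NUM_EPOCHS = 10
BATCH_SIZE = 128      # the loader is assumed to yield batches of this size
LEARNING_RATE = 0.01
L2_WEIGHT = 1.0       # lambda; one of the searched values {0.0, 0.1, 1.0}


def train(model: nn.Module, loader) -> None:
    """Minimize negative log-likelihood with Adagrad, per the quoted setup."""
    optimizer = torch.optim.Adagrad(
        model.parameters(), lr=LEARNING_RATE, weight_decay=L2_WEIGHT
    )
    criterion = nn.NLLLoss()                          # expects log-probabilities
    for _ in range(NUM_EPOCHS):
        for documents, queries, targets in loader:    # hypothetical batch fields
            optimizer.zero_grad()
            log_probs = model(documents, queries)     # shape: (batch, candidates)
            loss = criterion(log_probs, targets)
            loss.backward()
            optimizer.step()
```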