Representing Verbs as Argument Concepts
Authors: Yu Gong, Kaiqi Zhao, Kenny Zhu
AAAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we first show how we prepare the data for argument conceptualization. Then, we use some example concepts generated by our algorithm to show the advantage of our algorithm (AC) against selectional preference (SP), FrameNet (Baker, Fillmore, and Lowe 1998) and ReVerb (Fader, Soderland, and Etzioni 2011), as well as our baseline approach (BL) which considers equal weight for each argument (see Section 3). We also quantitatively evaluate the accuracies of AC, BL and SP on Probase. Finally, we apply our algorithm to an NLP task known as argument identification (Gildea and Palmer 2002; Abend, Reichart, and Rappoport 2009; Meza-Ruiz and Riedel 2009) and show that concepts generated by AC achieve better accuracy against BL, SP, ReVerb and a state-of-the-art semantic role labeling tool (using FrameNet) on both taxonomies. |
| Researcher Affiliation | Academia | Yu Gong¹, Kaiqi Zhao², and Kenny Q. Zhu³, Shanghai Jiao Tong University, Shanghai, China. ¹gy910210@163.com ²kaiqi zhao@163.com ³kzhu@cs.sjtu.edu.cn |
| Pseudocode | Yes | Algorithm 1 Argument Conceptualization |
| Open Source Code | No | All evaluation data sets and results are available at http://adapt.seiee.sjtu.edu.cn/ac. |
| Open Datasets | Yes | We use our algorithm to conceptualize subjects and objects for 1770 common verbs from Google syntactic N-grams (Goldberg and Orwant 2013; Google 2013) using Probase and WordNet as isA taxonomies. From the 1770-verb set, we sample 100 verbs with probability proportional to the frequency of the verb. This set of 100 verbs (Verb-100) is used for quantitative experiments, including evaluating the accuracy of argument concepts and the accuracy of argument identification. All argument instances we use in this work come from the Verbargs and Triarcs packages of the N-gram data. From the labeled dependency trees, we extract subject-verb dependency pairs (nsubj, agent) and object-verb dependency pairs (dobj, nsubjpass). |
| Dataset Splits | No | The paper describes the construction of a test set for evaluation but does not explicitly provide details about training or validation splits for model training or hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions several software tools and resources (e.g., Google syntactic N-grams, Probase, WordNet, SEMAFOR, Stanford CoreNLP) but does not provide specific version numbers for these or for any programming languages or libraries used in the implementation. |
| Experiment Setup | Yes | For the system parameters, we set the maximum overlap threshold between two concepts to 0.2, and the number of concepts k to {5, 10, 15} to evaluate argument concepts of different granularity. |
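The subject/object extraction described in the Open Datasets row can be sketched against the published syntactic N-gram record format (a head word, then space-separated tokens of the form `word/POS/dep-label/head-index` with a 1-based head index and 0 for the root, then a count). This is a minimal illustration, not the authors' code; the input line and the `wanted` relation set mirror the relations quoted above, but are assumptions about how one would reproduce the step.

```python
# Hedged sketch: pull (relation, head-verb, argument, count) tuples out of
# one Google syntactic N-gram line. Token format is assumed to be
# word/POS/dep-label/head-index, head-index 1-based, 0 = root.
def extract_pairs(line, wanted=("nsubj", "agent", "dobj", "nsubjpass")):
    head, ngram, count = line.split("\t")[:3]
    tokens = [t.rsplit("/", 3) for t in ngram.split()]
    pairs = []
    for word, pos, dep, head_idx in tokens:
        idx = int(head_idx)
        if dep in wanted and idx > 0:
            head_word = tokens[idx - 1][0]  # governing verb of this argument
            pairs.append((dep, head_word, word, int(count)))
    return pairs

# Fabricated example line in the assumed format:
line = "eat\tcats/NNS/nsubj/2 eat/VBP/ROOT/0 fish/NN/dobj/2\t42"
pairs = extract_pairs(line)
# pairs: [('nsubj', 'eat', 'cats', 42), ('dobj', 'eat', 'fish', 42)]
```

Filtering on the relation label at this stage is what separates subject-verb pairs (nsubj, agent) from object-verb pairs (dobj, nsubjpass) before conceptualization.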
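The Verb-100 construction (100 verbs drawn from the 1770 with probability proportional to frequency) can be sketched as below. The verb counts are hypothetical stand-ins for the real N-gram frequencies, and the weighted-key trick (key = u^(1/w), keep the k largest keys) is one standard way to get a frequency-proportional sample without replacement; the paper does not say which method the authors used.

```python
import random

def sample_verbs(verb_freqs, k, seed=0):
    """Draw k distinct verbs with probability proportional to frequency,
    using the weighted-key (Efraimidis-Spirakis) sampling trick."""
    rng = random.Random(seed)
    # key = u ** (1/weight); the k largest keys form a weighted
    # sample without replacement
    keyed = [(rng.random() ** (1.0 / freq), verb)
             for verb, freq in verb_freqs.items()]
    keyed.sort(reverse=True)
    return [verb for _, verb in keyed[:k]]

# Hypothetical frequencies standing in for Google syntactic N-gram counts
freqs = {"take": 900, "give": 700, "run": 500, "eat": 300, "devour": 20}
sample = sample_verbs(freqs, 3)
```

Seeding the generator makes the sampled verb set reproducible, which matters here since all quantitative results are reported on the sampled Verb-100 set.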