Discourse Relations Detection via a Mixed Generative-Discriminative Framework
Authors: Jifan Chen, Qi Zhang, Pengfei Liu, Xuanjing Huang
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two different datasets show that the proposed method, without using manually designed features, can achieve better performance on recognizing discourse-level relations in most cases. From the Experiment section: We evaluated the proposed method on two datasets: the Penn Discourse Treebank (Miltsakaki et al. 2004) and explanatory relations in product reviews (Zhang et al. 2013). |
| Researcher Affiliation | Academia | Jifan Chen, Qi Zhang, Pengfei Liu, Xuanjing Huang, Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, 825 Zhangheng Road, Shanghai, P.R. China. {jfchen14, qz, pfliu14, xjhuang}@fudan.edu.cn |
| Pseudocode | No | The paper contains mathematical derivations and descriptive steps but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology or links to a code repository. |
| Open Datasets | Yes | We evaluated the proposed method on two datasets: the Penn Discourse Treebank (Miltsakaki et al. 2004) and explanatory relations in product reviews (Zhang et al. 2013). The dataset we used in this work is Penn Discourse Treebank 2.0 (Prasad et al. 2008), which is one of the largest available annotated corpora of discourse relations. |
| Dataset Splits | Yes | We followed the recommended section partition of PDTB 2.0, which is to use sections 2-20 for training and sections 21-22 for testing (Prasad et al. 2008). We used a 10-fold cross-validation of the training set to select the optimal word embeddings as well as the number of Gaussian densities in the Gaussian Mixture Model (GMM). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper mentions various word embedding models (e.g., 'Mikolov (2013)', 'Collobert et al. (2011)') and notes that '300-dimensional vectors pre-trained by Mikolov (2013) achieve the best performance.' It also mentions using a 'Random Forest Classifier' and 'Gaussian Mixture Model (GMM)'. However, it does not specify exact version numbers for any libraries, frameworks, or specific software packages (e.g., scikit-learn version for Random Forest). |
| Experiment Setup | Yes | The optimal number of Gaussian densities in GMM is 16. As for the weights in the Weighted Fisher Vector, it is reported in the work of Pitler et al. (2009) that the nouns, verbs and adjectives in the pair contribute more to the detection of its relation. In this experiment, we simply set the weight of the offset between nouns, verbs and adjectives to 2, and the others to 1. |
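The setup above (a 16-component GMM over 300-dimensional word-embedding offsets, with POS-based weights of 2 for noun/verb/adjective offsets and 1 otherwise) can be sketched as follows. This is a minimal illustration, not the authors' released code: the exact Weighted Fisher Vector formulation is an assumption based on the standard first-order Fisher vector encoding, and `weighted_fisher_vector` plus the random placeholder data are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
dim = 300   # paper reports 300-dim pre-trained embeddings (Mikolov 2013)
n = 500     # placeholder number of word-pair offsets

# Placeholder word-pair embedding offsets; in the paper these are
# differences between embeddings of words drawn from the argument pair.
offsets = rng.standard_normal((n, dim))
# POS-based weights: 2.0 when the offset involves a noun/verb/adjective, else 1.0.
weights = rng.choice([1.0, 2.0], size=n)

# Paper: optimal number of Gaussian densities is 16 (chosen via 10-fold CV).
gmm = GaussianMixture(n_components=16, covariance_type="diag", random_state=0)
gmm.fit(offsets)

def weighted_fisher_vector(gmm, x, w):
    """First-order Fisher-vector-style encoding with per-sample weights.

    For each Gaussian k, accumulates the posterior-weighted, POS-weighted
    normalized residuals of every offset to that Gaussian's mean.
    """
    post = gmm.predict_proba(x)           # (N, K) soft assignments
    sigma = np.sqrt(gmm.covariances_)     # (K, D) diagonal std devs
    parts = []
    for k in range(gmm.n_components):
        resid = (x - gmm.means_[k]) / sigma[k]          # (N, D)
        coef = w * post[:, k]                           # (N,)
        g = (coef[:, None] * resid).sum(axis=0)
        g /= x.shape[0] * np.sqrt(gmm.weights_[k])      # standard FV normalization
        parts.append(g)
    return np.concatenate(parts)          # (K * D,)

fv = weighted_fisher_vector(gmm, offsets, weights)
print(fv.shape)  # (4800,) = 16 components x 300 dims
```

The resulting fixed-length vector would then be fed to the discriminative stage (the paper mentions a Random Forest classifier).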