Discourse Relations Detection via a Mixed Generative-Discriminative Framework
Authors: Jifan Chen, Qi Zhang, Pengfei Liu, Xuanjing Huang
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on two different datasets show that the proposed method, without using manually designed features, can achieve better performance on recognizing discourse-level relations in most cases. From the Experiment section: We evaluated the proposed method on two datasets: the Penn Discourse Treebank (Miltsakaki et al. 2004) and explanatory relations in product reviews (Zhang et al. 2013). |
| Researcher Affiliation | Academia | Jifan Chen, Qi Zhang, Pengfei Liu, Xuanjing Huang, Shanghai Key Laboratory of Data Science, School of Computer Science, Fudan University, 825 Zhangheng Road, Shanghai, P.R. China. {jfchen14, qz, pfliu14, xjhuang}@fudan.edu.cn |
| Pseudocode | No | The paper contains mathematical derivations and descriptive steps but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code for the methodology or links to a code repository. |
| Open Datasets | Yes | We evaluated the proposed method on two datasets: the Penn Discourse Treebank (Miltsakaki et al. 2004) and explanatory relations in product reviews (Zhang et al. 2013). The dataset we used in this work is Penn Discourse Treebank 2.0 (Prasad et al. 2008), which is one of the largest available annotated corpora of discourse relations. |
| Dataset Splits | Yes | We followed the recommended section partition of PDTB 2.0, which is to use sections 2-20 for training and sections 21-22 for testing (Prasad et al. 2008). We used a 10-fold cross-validation of the training set to select the optimal word embeddings as well as the number of Gaussian densities in the Gaussian Mixture Model (GMM). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper mentions various word embedding models (e.g., 'Mikolov (2013)', 'Collobert et al. (2011)') and notes that '300-dimensional vectors pre-trained by Mikolov (2013) achieve the best performance.' It also mentions using a 'Random Forest Classifier' and 'Gaussian Mixture Model (GMM)'. However, it does not specify exact version numbers for any libraries, frameworks, or specific software packages (e.g., scikit-learn version for Random Forest). |
| Experiment Setup | Yes | The optimal number of Gaussian densities in GMM is 16. As for the weights in the Weighted Fisher Vector, it is reported in the work of Pitler et al. (2009) that the nouns, verbs and adjectives in the pair contribute more to the detection of its relation. In this experiment, we simply set the weight of the offset between nouns, verbs and adjectives to 2, and the others to 1. |
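The setup above (a 16-component GMM over 300-dimensional word-embedding offsets, with POS-based weights of 2 for noun/verb/adjective offsets and 1 otherwise) can be sketched as follows. This is a minimal illustration, not the authors' released code: the exact Weighted Fisher Vector formulation is an assumption based on the standard first-order Fisher vector encoding, and `weighted_fisher_vector` plus the random placeholder data are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
dim = 300   # paper reports 300-dim pre-trained embeddings (Mikolov 2013)
n = 500     # placeholder number of word-pair offsets

# Placeholder word-pair embedding offsets; in the paper these are
# differences between embeddings of words drawn from the argument pair.
offsets = rng.standard_normal((n, dim))
# POS-based weights: 2.0 when the offset involves a noun/verb/adjective, else 1.0.
weights = rng.choice([1.0, 2.0], size=n)

# Paper: optimal number of Gaussian densities is 16 (chosen via 10-fold CV).
gmm = GaussianMixture(n_components=16, covariance_type="diag", random_state=0)
gmm.fit(offsets)

def weighted_fisher_vector(gmm, x, w):
    """First-order Fisher-vector-style encoding with per-sample weights.

    For each Gaussian k, accumulates the posterior-weighted, POS-weighted
    normalized residuals of every offset to that Gaussian's mean.
    """
    post = gmm.predict_proba(x)           # (N, K) soft assignments
    sigma = np.sqrt(gmm.covariances_)     # (K, D) diagonal std devs
    parts = []
    for k in range(gmm.n_components):
        resid = (x - gmm.means_[k]) / sigma[k]          # (N, D)
        coef = w * post[:, k]                           # (N,)
        g = (coef[:, None] * resid).sum(axis=0)
        g /= x.shape[0] * np.sqrt(gmm.weights_[k])      # standard FV normalization
        parts.append(g)
    return np.concatenate(parts)          # (K * D,)

fv = weighted_fisher_vector(gmm, offsets, weights)
print(fv.shape)  # (4800,) = 16 components x 300 dims
```

The resulting fixed-length vector would then be fed to the discriminative stage (the paper mentions a Random Forest classifier).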