Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026].
Implicit Discourse Relation Classification via Multi-Task Neural Networks
Authors: Yang Liu, Sujian Li, Xiaodong Zhang, Zhifang Sui
AAAI 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results on the PDTB implicit discourse relation classification task demonstrate that our model achieves significant gains over baseline systems. |
| Researcher Affiliation | Academia | 1 Key Laboratory of Computational Linguistics, Peking University, MOE, China 2 Collaborative Innovation Center for Language Ability, Xuzhou, Jiangsu, China EMAIL |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements or links about the open-source code for the described methodology. |
| Open Datasets | Yes | The Penn Discourse Treebank (PDTB) (Prasad et al. 2007)... RST-DT is based on the Rhetorical Structure Theory (RST) proposed by (Mann and Thompson 1988)... In our work, we adopt the New York Times (NYT) Corpus (Sandhaus 2008)... |
| Dataset Splits | Yes | We follow the setup of previous studies (Pitler, Louis, and Nenkova 2009), splitting the dataset into a training set, development set, and test set. Sections 2-20 are used to train classifiers, Sections 0-1 to develop feature sets and tune models, and Section 21-22 to test the systems. |
| Hardware Specification | No | The paper does not provide any specific hardware details (like GPU or CPU models, or cloud computing instances) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'GloVe' and 'the Stanford parser' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | Model Configuration... The learning rates are set as λ = 0.004, λe = 0.001. Each task has a set of hyper-parameters, including the window size of CNN h, the pooling size np, the number of filters nf, dimension of the task-specific representation nr, and the regulative ratios μ and μe. ... The detailed settings are shown in Table 6. |
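
The section-based split quoted in the Dataset Splits row can be sketched as a small lookup. This is a minimal illustration, not code from the paper: the function name `pdtb_split`, the `"unused"` label, and the assumption that sections are numbered 0 through 24 are all hypothetical choices made here for the example.

```python
# Hypothetical helper (not from the paper): map a PDTB section number
# to the split described in the Dataset Splits row.
def pdtb_split(section: int) -> str:
    if 2 <= section <= 20:
        return "train"   # Sections 2-20: train classifiers
    if section in (0, 1):
        return "dev"     # Sections 0-1: develop feature sets and tune models
    if section in (21, 22):
        return "test"    # Sections 21-22: test the systems
    return "unused"      # Assumption: any other section is left out

# Group section numbers by split, assuming sections run from 0 to 24.
splits = {}
for section in range(25):
    splits.setdefault(pdtb_split(section), []).append(section)
```

Under these assumptions, `splits["dev"]` holds sections 0 and 1, `splits["test"]` holds sections 21 and 22, and the remaining sections outside 2-20 fall under `"unused"`.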