Detecting Information-Dense Texts in Multiple News Domains
Authors: Yinfei Yang, Ani Nenkova
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We train a classifier based on lexical, discourse and unlexicalized syntactic features and test its performance on a set of manually annotated articles from business, U.S. international relations, sports and science domains. Our results indicate that the task is feasible and that both syntactic and lexical features are highly predictive for the distinction. We observe considerable variation of prediction accuracy across domains and find that domain-specific models are more accurate. |
| Researcher Affiliation | Collaboration | Yinfei Yang (Amazon Inc., yinfyang@amazon.com); Ani Nenkova (University of Pennsylvania, nenkova@seas.upenn.edu) |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | The data for our experiments comes from the New York Times (NYT) corpus (LDC2008T19). This corpus contains 20 years worth of NYT, along with metadata about the newspaper section in which the article appeared and manual summaries for many of the articles. |
| Dataset Splits | Yes | We perform 10-fold cross-validation on the automatically labeled data with all features combined, but also analyze the performance when only a given class of features is used. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment. |
| Experiment Setup | Yes | We trained a binary classifier using LibSVM (R.-E. Fan and Lin 2008) with linear kernel and default parameter settings. |
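The experiment setup and dataset-split rows describe a linear-kernel SVM with default parameters evaluated by 10-fold cross-validation. A minimal sketch of that protocol is below, assuming scikit-learn (whose `SVC` wraps LibSVM) in place of the paper's direct LibSVM usage; the feature matrix here is random placeholder data standing in for the paper's lexical, discourse, and syntactic features.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

# Placeholder data: 200 articles, 50 hypothetical feature dimensions.
# In the paper these would be lexical, discourse, and unlexicalized
# syntactic features extracted from NYT corpus leads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
y = rng.integers(0, 2, size=200)  # 1 = information-dense, 0 = not

# Linear kernel, default parameters, as stated in the paper.
clf = SVC(kernel="linear")

# 10-fold cross-validation on the (automatically labeled) training data.
scores = cross_val_score(clf, X, y, cv=10)
print("mean 10-fold accuracy: %.3f" % scores.mean())
```

With random labels the accuracy hovers near chance; the point of the sketch is the evaluation shape (one classifier, default settings, 10 folds), not the score.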