Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Combining Lexical and Syntactic Features for Detecting Content-Dense Texts in News

Authors: Yinfei Yang, Ani Nenkova

JAIR 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Here we empirically test this assumption on news articles from the business, U.S. international relations, sports and science journalism domains. Our findings clearly indicate that about half of the news texts in our study are in fact not content-dense and motivate the development of a supervised content-density detector. We heuristically label a large training corpus for the task and train a two-layer classifying model based on lexical and unlexicalized syntactic features. On manually annotated data, we compare the performance of domain-specific classifiers, trained on data only from a given news domain, and a general classifier in which data from all four domains is pooled together. Our annotation and prediction experiments demonstrate that the concept of content density varies depending on the domain and that naive annotators provide judgement biased toward the stereotypical domain label.
Researcher Affiliation | Collaboration | Yinfei Yang, EMAIL, 1600 Amphitheatre Pkwy, Mountain View, CA 94043; Ani Nenkova, EMAIL, University of Pennsylvania, 3330 Walnut Street, Philadelphia, PA 19103, USA
Pseudocode | No | The paper describes the methodology using textual explanations and diagrams (Figure 2) but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | All data for the work presented in this paper and the domain-dependent and general classifiers will be made publicly available with the publication of this article.
Open Datasets | Yes | The data for our experiments comes from the New York Times (NYT) annotated corpus (LDC Catalog No. LDC2008T19). The corpus contains 20 years' worth of NYT editions, along with rich meta-data about the newspaper section in which the article appeared and summaries produced by information scientists for many of the articles. The leads of articles are explicitly marked in the corpus, so extracting the relevant text for further analysis is straightforward.
Dataset Splits | Yes | We perform 10-fold cross-validation experiments on the entire heuristically labeled data. The entire dataset is split into 10 partitions. At each run, five partitions are used for training first-stage classifiers and the feature-level combination classifier. Four partitions are used for training the second-stage combination classifier, which uses only the probabilities of the content-dense class from the first-stage classifiers. One partition is used for testing the classifiers.
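The 5/4/1 partition scheme described above can be sketched in plain Python. The rotation of partition indices across runs is an illustrative assumption; the paper does not specify how partitions are assigned to roles in each fold.

```python
# Sketch of the paper's 10-fold split: in each run, of the 10 partitions,
# 5 train the first-stage classifiers (and the feature-level combination),
# 4 train the second-stage combination classifier, and 1 is held out for testing.
# The rotation scheme below is a hypothetical assignment for illustration.
def fold_roles(run, n_parts=10):
    """Return (stage1_train, stage2_train, test) partition indices for one run."""
    order = [(run + i) % n_parts for i in range(n_parts)]
    return order[:5], order[5:9], order[9:]

for run in range(10):
    s1, s2, test = fold_roles(run)
    assert len(s1) == 5 and len(s2) == 4 and len(test) == 1
```

Over the 10 runs, every partition serves exactly once as the test set, matching standard 10-fold cross-validation.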
Hardware Specification | No | The paper does not explicitly describe the hardware used for running the experiments. It focuses on the models and data without mentioning specific CPUs, GPUs, or other hardware components.
Software Dependencies | Yes | In the feature-level combination system, we train the binary classifier using LibLinear (R.E. Fan & Lin, 2008) with the L2-regularized logistic regression model setting. In the decision-level combination experiments, we first train binary classifiers based on each feature representation using LibLinear with the same settings. Using the probability outputs (for the content-dense class) of the first-stage classifiers as features, we then train a final binary classifier using LibSVM (Chang & Lin, 2011) with a linear kernel. Grid search is used on the training and development sets to find the best hyper-parameters in all models. ... The Stanford CoreNLP package (Manning, Surdeanu, Bauer, Finkel, Bethard, & McClosky, 2014) is used to extract production rules.
Experiment Setup | Yes | In the feature-level combination system, we train the binary classifier using LibLinear (R.E. Fan & Lin, 2008) with the L2-regularized logistic regression model setting. In the decision-level combination experiments, we first train binary classifiers based on each feature representation using LibLinear with the same settings. Using the probability outputs (for the content-dense class) of the first-stage classifiers as features, we then train a final binary classifier using LibSVM (Chang & Lin, 2011) with a linear kernel. Grid search is used on the training and development sets to find the best hyper-parameters in all models.
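The two-stage (decision-level) combination quoted above can be sketched with scikit-learn, whose `LogisticRegression(solver="liblinear")` wraps Liblinear and whose `SVC(kernel="linear")` wraps LibSVM. The synthetic data, feature names, and lack of grid search are assumptions for illustration, not the paper's actual features or hyper-parameters.

```python
# Hedged sketch of a two-stage combination: per-representation L2 logistic
# regression (Liblinear-backed), then a linear-kernel SVM (LibSVM-backed)
# over the positive-class probabilities. Data here is synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 200
# Two synthetic "feature representations" (stand-ins for lexical vs. syntactic).
X_lex = rng.normal(size=(n, 5))
X_syn = rng.normal(size=(n, 3))
y = (X_lex[:, 0] + X_syn[:, 0] > 0).astype(int)

# Stage 1: one L2-regularized logistic regression per representation.
stage1 = [
    LogisticRegression(penalty="l2", solver="liblinear").fit(X, y)
    for X in (X_lex, X_syn)
]

# Stage 2: probabilities of the positive ("content-dense") class become
# the features for a linear-kernel SVM.
P = np.column_stack(
    [clf.predict_proba(X)[:, 1] for clf, X in zip(stage1, (X_lex, X_syn))]
)
stage2 = SVC(kernel="linear").fit(P, y)
preds = stage2.predict(P)
```

In the paper's setup the two stages are trained on disjoint partitions and hyper-parameters are tuned by grid search; this sketch trains both on the same data purely to show the data flow.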