Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Hierarchical Coherence Modeling for Document Quality Assessment
Authors: Dongliang Liao, Jin Xu, Gongfu Li, Yiru Wang (pp. 13353-13361)
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed method on two realistic tasks: news quality judgement and automated essay scoring. Experimental results demonstrate the validity and superiority of our work. |
| Researcher Affiliation | Industry | Dongliang Liao, Jin Xu*, Gongfu Li, Yiru Wang Data Quality Team, WeChat, Tencent Inc., China. EMAIL |
| Pseudocode | Yes | Algorithm 1: Text Coherence Modeling |
| Open Source Code | Yes | Code and Dataset: https://github.com/BrightLiao/HierCoh |
| Open Datasets | Yes | Code and Dataset: https://github.com/BrightLiao/HierCoh. For the AES task, The hidden sizes of H-Trans and proposed methods are empirically set as 64 for word embedding, Transformers and attention layers. We follow the 5-fold evaluation method with Taghipour and Ng (2016) and reuse the data preprocess code of Dong, Zhang, and Yang (2017). |
| Dataset Splits | Yes | We follow the 5-fold evaluation method with Taghipour and Ng (2016)... We sample 80% of news pairs as the training set, 10% news pairs as the validation set and 10% as the test set. |
| Hardware Specification | Yes | All experiments are constructed based on TensorFlow with Tesla P40 GPU. |
| Software Dependencies | No | The paper mentions 'TensorFlow' but does not specify a version number or other software dependencies with their versions, which is required for reproducibility. |
| Experiment Setup | Yes | We set the coherence vector size (i.e. the hidden size of bilinear layer) as 5 follows Tay et al. (2018). The window size k and layer number of max-coherence pooling L is fine tuned on {3, 5, 7, 11} and {1, 2, 4, 8} respectively. The strides p is set as the half of window size p = k/2 empirically. We adopt the Adam with 0.0005 learning rate for training and employ a dropout mechanism on the input word embedding with dropout rate 0.5. |
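
The Dataset Splits and Experiment Setup rows can be read as a small search configuration. The sketch below is illustrative only and is not the authors' code: it enumerates the quoted window sizes and layer counts, derives the stride as half the window size, and performs an 80/10/10 split. The function names, the dictionary keys, the fixed random seed, and the use of floor division for p = k/2 (all quoted window sizes are odd) are assumptions made for this example.

```python
import itertools
import random

# Values quoted in the Experiment Setup row.
WINDOW_SIZES = (3, 5, 7, 11)   # candidate window sizes k
LAYER_COUNTS = (1, 2, 4, 8)    # candidate max-coherence pooling layer counts L
FIXED = {
    "coherence_vector_size": 5,
    "optimizer": "adam",
    "learning_rate": 0.0005,
    "dropout_rate": 0.5,
}


def stride_for(window_size):
    """Stride p = k/2. Floor division is an assumption, since every k is odd."""
    return window_size // 2


def config_grid():
    """Return one configuration dict per (k, L) combination."""
    configs = []
    for k, num_layers in itertools.product(WINDOW_SIZES, LAYER_COUNTS):
        cfg = dict(FIXED)
        cfg.update(window_size=k, stride=stride_for(k), num_layers=num_layers)
        configs.append(cfg)
    return configs


def split_pairs(pairs, seed=0):
    """Shuffle and split into 80% train, 10% validation, and the rest as test.

    The seed is hypothetical; the source does not report one.
    """
    items = list(pairs)
    random.Random(seed).shuffle(items)
    n_train = len(items) * 8 // 10
    n_val = len(items) // 10
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test
```

Calling `config_grid()` yields the 16 combinations implied by the two quoted search sets, and `split_pairs` can be applied to any list of news pairs. The 5-fold protocol used for the AES task is not covered by this sketch.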