Hierarchical Coherence Modeling for Document Quality Assessment

Authors: Dongliang Liao, Jin Xu, Gongfu Li, Yiru Wang

AAAI 2021, pp. 13353-13361 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate the proposed method on two realistic tasks: news quality judgement and automated essay scoring. Experimental results demonstrate the validity and superiority of our work.
Researcher Affiliation | Industry | Dongliang Liao, Jin Xu*, Gongfu Li, Yiru Wang. Data Quality Team, WeChat, Tencent Inc., China. {brightliao, jinxxu, gongfuli, dorisyrwang}@tencent.com
Pseudocode | Yes | Algorithm 1: Text Coherence Modeling
Open Source Code | Yes | Code and Dataset: https://github.com/BrightLiao/HierCoh
Open Datasets | Yes | Code and Dataset: https://github.com/BrightLiao/HierCoh. For the AES task, the hidden sizes of H-Trans and the proposed methods are empirically set to 64 for the word embedding, Transformer, and attention layers. We follow the 5-fold evaluation method of Taghipour and Ng (2016) and reuse the data preprocessing code of Dong, Zhang, and Yang (2017).
Dataset Splits | Yes | We follow the 5-fold evaluation method of Taghipour and Ng (2016)... We sample 80% of news pairs as the training set, 10% of news pairs as the validation set, and 10% as the test set. (A split sketch follows this table.)
Hardware Specification | Yes | All experiments are conducted on TensorFlow with a Tesla P40 GPU.
Software Dependencies | No | The paper mentions TensorFlow but does not specify a version number or other software dependencies with their versions, which is required for reproducibility.
Experiment Setup | Yes | We set the coherence vector size (i.e., the hidden size of the bilinear layer) to 5, following Tay et al. (2018). The window size k and the layer number L of max-coherence pooling are fine-tuned over {3, 5, 7, 11} and {1, 2, 4, 8}, respectively. The stride p is set to half the window size (p = k/2). We adopt Adam with a 0.0005 learning rate for training and employ dropout on the input word embeddings with a dropout rate of 0.5. (A configuration sketch follows this table.)
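
The dataset-split protocol reported above (an 80/10/10 sample of news pairs for the news quality task, plus the 5-fold evaluation of Taghipour and Ng (2016) for AES) can be mirrored with a short helper. This is a minimal sketch assuming the news pairs are available as an in-memory sequence; the function names and the fixed seed are illustrative, not taken from the released code.

```python
import random
from typing import List, Sequence, Tuple

def split_news_pairs(pairs: Sequence, seed: int = 0) -> Tuple[List, List, List]:
    """Sample 80% of news pairs for training, 10% for validation, 10% for test."""
    items = list(pairs)
    random.Random(seed).shuffle(items)
    n_train = int(0.8 * len(items))
    n_val = int(0.1 * len(items))
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

def five_fold_indices(n_essays: int) -> List[Tuple[List[int], List[int]]]:
    """5-fold split for the AES task: each fold yields (train_indices, test_indices)."""
    indices = list(range(n_essays))
    folds = [indices[i::5] for i in range(5)]
    return [(sorted(set(indices) - set(fold)), fold) for fold in folds]
```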
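
The experiment-setup row can likewise be read as a compact configuration. Below is a minimal sketch assuming TensorFlow 2.x (the paper states only "TensorFlow" and a Tesla P40 GPU, with no version); the dictionary keys and helper names are hypothetical and do not come from the authors' repository.

```python
import tensorflow as tf

# Hyperparameters quoted from the paper's experiment setup.
CONFIG = {
    "hidden_size": 64,               # word embedding / Transformer / attention layers (AES task)
    "coherence_vector_size": 5,      # hidden size of the bilinear layer, following Tay et al. (2018)
    "window_sizes": (3, 5, 7, 11),   # tuning grid for window size k
    "pooling_layers": (1, 2, 4, 8),  # tuning grid for layer number L of max-coherence pooling
    "learning_rate": 5e-4,
    "dropout_rate": 0.5,             # applied to the input word embeddings
}

def stride_for(window_size: int) -> int:
    # The stride p is set to half the window size, p = k/2 (rounded down here).
    return max(window_size // 2, 1)

def make_optimizer() -> tf.keras.optimizers.Optimizer:
    # Adam with the reported learning rate of 0.0005.
    return tf.keras.optimizers.Adam(learning_rate=CONFIG["learning_rate"])

# Dropout on the input word embeddings, as described in the setup.
input_dropout = tf.keras.layers.Dropout(CONFIG["dropout_rate"])
```

The window-size and pooling-depth grids would be searched on the validation split; everything else in the dictionary is fixed across runs according to the quoted setup.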