Importance-Aware Learning for Neural Headline Editing

Authors: Qingyang Wu, Lei Li, Hao Zhou, Ying Zeng, Zhou Yu (pp. 9282-9289)

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results show that our method significantly improves the quality of headline editing compared against previous methods.
Researcher Affiliation | Collaboration | 1. University of California, Davis; 2. ByteDance. {wilwu, joyu}@ucdavis.edu, {lileilab, zhouhao.nlp, zengying.ss}@bytedance.com
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | We have released the pre-trained Chinese-GPT model. https://github.com/qywu/Chinese-GPT
Open Datasets | No | The paper describes collecting the Professional Headline Editing Dataset (PHED) and a Large Scale Chinese Corpus for NLP, providing a link for the latter: https://github.com/brightmart/nlp_chinese_corpus. However, the PHED dataset, which is central to the main task, is described as 'constructed' by the authors, and no concrete access information (link, DOI, or specific citation for access) is provided for it.
Dataset Splits | No | The paper mentions stopping pre-training based on 'validation perplexity' and selecting samples 'from the test set (1,500 samples in total)', but it does not provide specific training/validation/test split percentages or explicit counts for all splits of the PHED dataset.
Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU/CPU models or specific cloud instances.
Software Dependencies | No | The paper mentions using the Transformer architecture and BERT, but does not provide specific version numbers for any software libraries or dependencies.
Experiment Setup | Yes | We conduct a hyper-parameter search to find the best α and β for SIA. ... We set SIA's α = 0.2 and β = 40.0. ... during training the batches will be fed in the same order. ... During inference, we apply beam search decoding with beam size 10 for all models. We add the length normalization technique (Wu et al. 2016). The temperature is set to 1.0 as it yields the best result.
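The length-normalization technique the setup cites (Wu et al. 2016) can be sketched as follows. This is a minimal illustration, not the paper's code; the `alpha` here is the length-penalty exponent from Wu et al. (commonly 0.6), not SIA's α = 0.2, and the candidate scores are made up for demonstration.

```python
def length_penalty(length: int, alpha: float = 0.6) -> float:
    """Wu et al. (2016) length penalty: lp(Y) = (5 + |Y|)^alpha / (5 + 1)^alpha."""
    return ((5.0 + length) ** alpha) / ((5.0 + 1.0) ** alpha)

def normalized_score(log_prob: float, length: int, alpha: float = 0.6) -> float:
    """Rank a beam hypothesis by its length-normalized log-probability."""
    return log_prob / length_penalty(length, alpha)

# Hypothetical beam candidates: raw log-prob favors the shorter hypothesis
# (-4.0 > -5.0), but after normalization the longer one, whose per-token
# probability is higher, wins the re-ranking.
short = normalized_score(-4.0, length=4)   # short headline candidate
long_ = normalized_score(-5.0, length=10)  # longer headline candidate
best = max([("short", short), ("long", long_)], key=lambda c: c[1])
```

Without this normalization, raw log-probabilities systematically bias beam search toward short outputs, which is why it is standard in headline and translation decoding.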