Importance-Aware Learning for Neural Headline Editing
Authors: Qingyang Wu, Lei Li, Hao Zhou, Ying Zeng, Zhou Yu
AAAI 2020, pp. 9282-9289 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that our method significantly improves the quality of headline editing comparing against previous methods. |
| Researcher Affiliation | Collaboration | 1University of California, Davis, 2Byte Dance, {wilwu, joyu}@ucdavis.edu, {lileilab,zhouhao.nlp,zengying.ss}@bytedance.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We have released the pre-trained Chinese-GPT model. https://github.com/qywu/Chinese-GPT |
| Open Datasets | No | The paper describes collecting the Professional Headline Editing Dataset (PHED) and a Large Scale Chinese Corpus for NLP, providing a link for the latter: 'We collect our corpus from Large Scale Chinese Corpus for NLP: https://github.com/brightmart/nlp_chinese_corpus'. However, the PHED dataset, which is central to their main task, is described as 'constructed' by the authors, and no concrete access information (link, DOI, specific citation for access) is provided for it. |
| Dataset Splits | No | The paper mentions stopping pre-training based on 'validation perplexity' and selecting samples 'from the test set (1,500 samples in total)', but it does not provide specific training/validation/test split percentages or explicit counts for all splits of the PHED dataset. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU/CPU models or specific cloud instances. |
| Software Dependencies | No | The paper mentions using Transformer architecture and BERT, but does not provide specific version numbers for any software libraries or dependencies. |
| Experiment Setup | Yes | We conduct hyper-parameters search for finding the best α and β for SIA. ... We set SIA's α = 0.2 and β = 40.0. ... during training the batches will be fed in the same order. ... During inference, we apply beam search decoding with beam size 10 for all models. We add the length normalization technique (Wu et al. 2016). The temperature is set to be 1.0 as it yields the best result. |
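The Experiment Setup row reports the decoding configuration (beam size 10, length normalization per Wu et al. 2016, temperature 1.0) and the SIA weights α = 0.2, β = 40.0. The sketch below is a minimal illustration of that configuration, assuming the GNMT-style length penalty from Wu et al. (2016); the length-penalty exponent and all helper names are assumptions, not the authors' released code.

```python
# Hedged sketch of the reported inference settings, not the authors' implementation.
import math

# Hyper-parameters quoted in the Experiment Setup row.
SIA_ALPHA = 0.2      # SIA weight alpha
SIA_BETA = 40.0      # SIA weight beta
BEAM_SIZE = 10       # beam search width used for all models
TEMPERATURE = 1.0    # softmax temperature at inference


def length_normalized_score(log_prob_sum: float,
                            length: int,
                            lp_alpha: float = 0.6) -> float:
    """GNMT length penalty (Wu et al. 2016):
    lp(Y) = ((5 + |Y|)^a) / ((5 + 1)^a), score = log P(Y) / lp(Y).
    lp_alpha is the length-penalty exponent (distinct from SIA's alpha);
    its value is not reported in the table, so 0.6 here is an assumption."""
    lp = ((5.0 + length) ** lp_alpha) / ((5.0 + 1.0) ** lp_alpha)
    return log_prob_sum / lp


if __name__ == "__main__":
    # Example: length normalization lets a longer beam with a lower raw
    # log-probability still compete with a shorter one.
    short_beam = length_normalized_score(log_prob_sum=-4.0, length=8)
    long_beam = length_normalized_score(log_prob_sum=-5.5, length=16)
    print(f"short beam score: {short_beam:.4f}")
    print(f"long beam score:  {long_beam:.4f}")
```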