Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation

Authors: Xiaocheng Feng, Yawei Sun, Bing Qin, Heng Gong, Yibo Sun, Wei Bi, Xiaojiang Liu, Ting Liu

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical results show superiority of our approaches over competitive methods, and the models also yield a new state-of-the-art result on a sentence-level dataset. To verify the effectiveness of our text manipulation approaches, we first build a large unsupervised document-level text manipulation dataset, which is extracted from an NBA game report corpus (Wiseman, Shieber, and Rush 2017). Experiments of different methods on this new corpus show that our full model achieves 35.02 in Style BLEU and 39.47 F-score in Content Selection, substantially better than baseline methods.
Researcher Affiliation | Collaboration | Xiaocheng Feng (1), Yawei Sun (1), Bing Qin (1), Heng Gong (1), Yibo Sun (1), Wei Bi (2), Xiaojiang Liu (2), Ting Liu (1); (1) Harbin Institute of Technology, Harbin, China; (2) Tencent AI Lab, Shenzhen, China
Pseudocode | No | The paper describes the model architecture and components in text and diagrams but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and data are available at: https://github.com/syw1996/SCIR-TG-Data2text-Bi-Aspect
Open Datasets | Yes | To verify the effectiveness of our text manipulation approaches, we first build a large unsupervised document-level text manipulation dataset, which is extracted from an NBA game report corpus (Wiseman, Shieber, and Rush 2017). Our code and data are available at: https://github.com/syw1996/SCIR-TG-Data2text-Bi-Aspect
Dataset Splits | Yes | Table 1 (Document-level/Sentence-level Data Statistics): #Instances: Train 3,371/31,751, Dev 722/6,833, Test 728/6,999; Avg Ref Length: Train 335.55/25.90, Dev 341.17/25.82, Test 346.83/25.99; #Data Types: 37/34 for all splits; Avg Input Record Length: 606/5 for all splits; Avg Output Record Length: Train 38.05/4.88, Dev 37.80/4.85, Test 31.32/4.94.
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions using LSTMs and the Adam optimizer, but does not provide specific version numbers for any software dependencies such as Python, PyTorch, or TensorFlow.
Experiment Setup | Yes | We set the hyperparameters empirically based on multiple tries with different settings. We find the following setting to be the best. The dimension of word/feature embedding, encoder hidden state, and decoder hidden state are all set to be 600. We apply dropout at a rate of 0.3. Our training process consists of three parts. In the first, we set λ1 = 0 and λ2 = 1 in Eq. 7 and pre-train the model to convergence. We then set λ1 = 0.5 and λ2 = 0.5 for the next stage training. Finally, we set λ1 = 0.4 and λ2 = 0.5 for full training. Adam is used for parameter optimization with an initial learning rate of 0.001 and decaying rate of 0.97. During testing, we use beam search with beam size of 5. The minimum decoding length is set to be 150 and maximum decoding length is set to be 850.
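For orientation, the quoted setup can be mapped onto a small configuration sketch like the one below. This is not the authors' code: the function names (combined_loss, make_optimizer) and the placeholder model are hypothetical, and reading Eq. 7 as a weighted sum of a content objective and a style objective is an assumption; only the numeric hyperparameter values come from the paper's description, here rendered in PyTorch for concreteness.

```python
# Minimal sketch of the reported training configuration (assumed PyTorch).
# Only the numeric values are taken from the paper; everything else is a
# hypothetical scaffold for illustration.
import torch

HPARAMS = {
    "embedding_dim": 600,    # word/feature embeddings
    "encoder_hidden": 600,   # encoder hidden state size
    "decoder_hidden": 600,   # decoder hidden state size
    "dropout": 0.3,
    "learning_rate": 1e-3,   # Adam initial learning rate
    "lr_decay": 0.97,        # per-epoch decay rate
    "beam_size": 5,
    "min_decode_len": 150,
    "max_decode_len": 850,
}

# Three-stage schedule for the loss weights in Eq. 7:
#   stage 0: lambda1 = 0.0, lambda2 = 1.0 (pre-train to convergence)
#   stage 1: lambda1 = 0.5, lambda2 = 0.5
#   stage 2: lambda1 = 0.4, lambda2 = 0.5 (full training)
LAMBDA_SCHEDULE = [(0.0, 1.0), (0.5, 0.5), (0.4, 0.5)]


def combined_loss(content_loss, style_loss, stage):
    """Weighted sum of the two objectives, assuming Eq. 7 has this form."""
    lam1, lam2 = LAMBDA_SCHEDULE[stage]
    return lam1 * content_loss + lam2 * style_loss


def make_optimizer(model):
    """Adam with the reported initial learning rate and exponential decay."""
    opt = torch.optim.Adam(model.parameters(), lr=HPARAMS["learning_rate"])
    sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=HPARAMS["lr_decay"])
    return opt, sched
```

A usage note: under this reading, one would train stage 0 to convergence before switching the schedule index, calling sched.step() once per epoch to apply the 0.97 decay, and apply the beam-search limits (beam size 5, decoding length 150-850) only at test time.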