Learning to Select Bi-Aspect Information for Document-Scale Text Content Manipulation
Authors: Xiaocheng Feng, Yawei Sun, Bing Qin, Heng Gong, Yibo Sun, Wei Bi, Xiaojiang Liu, Ting Liu (pp. 7716-7723)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show superiority of our approaches over competitive methods, and the models also yield a new state-of-the-art result on a sentence-level dataset. To verify the effectiveness of our text manipulation approaches, we first build a large unsupervised document-level text manipulation dataset, which is extracted from an NBA game report corpus (Wiseman, Shieber, and Rush 2017). Experiments of different methods on this new corpus show that our full model achieves 35.02 in Style BLEU and 39.47 F-score in Content Selection, substantially better than baseline methods. |
| Researcher Affiliation | Collaboration | Xiaocheng Feng¹, Yawei Sun¹, Bing Qin¹, Heng Gong¹, Yibo Sun¹, Wei Bi², Xiaojiang Liu², Ting Liu¹ — ¹Harbin Institute of Technology, Harbin, China; ²Tencent AI Lab, Shenzhen, China |
| Pseudocode | No | The paper describes the model architecture and components in text and diagrams but does not include any pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code and data are available at: https://github.com/syw1996/SCIR-TG-Data2text-Bi-Aspect |
| Open Datasets | Yes | To verify the effectiveness of our text manipulation approaches, we first build a large unsupervised document-level text manipulation dataset, which is extracted from an NBA game report corpus (Wiseman, Shieber, and Rush 2017). Our code and data are available at: https://github.com/syw1996/SCIR-TG-Data2text-Bi-Aspect |
| Dataset Splits | Yes | Table 1 (Document-level/Sentence-level Data Statistics): #Instances — 3,371/31,751 (train), 722/6,833 (dev), 728/6,999 (test); Avg Ref Length — 335.55/25.90 (train), 341.17/25.82 (dev), 346.83/25.99 (test); #Data Types — 37/34 in all splits; Avg Input Record Length — 606/5 in all splits; Avg Output Record Length — 38.05/4.88 (train), 37.80/4.85 (dev), 31.32/4.94 (test). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using LSTMs and Adam optimizer, but does not provide specific version numbers for any software dependencies like Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | We set the hyperparameters empirically based on multiple tries with different settings. We find the following setting to be the best. The dimension of word/feature embedding, encoder hidden state, and decoder hidden state are all set to be 600. We apply dropout at a rate of 0.3. Our training process consists of three parts. In the first, we set λ1 = 0 and λ2 = 1 in Eq. 7 and pre-train the model to convergence. We then set λ1 = 0.5 and λ2 = 0.5 for the next stage training. Finally, we set λ1 = 0.4 and λ2 = 0.5 for full training. Adam is used for parameter optimization with an initial learning rate of 0.001 and decaying rate of 0.97. During testing, we use beam search with beam size of 5. The minimum decoding length is set to be 150 and maximum decoding length is set to be 850. |
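The staged training schedule quoted above (a weighted combination of two objectives in Eq. 7, with stage-specific λ values and Adam at an initial learning rate of 0.001 decayed by 0.97) can be sketched as follows. This is a minimal illustration, not the authors' code: the loss terms and the per-epoch decay interpretation are assumptions, and `combined_loss` stands in for whatever Eq. 7 combines.

```python
# Hypothetical sketch of the paper's three-stage loss weighting.
# L = lambda1 * loss_a + lambda2 * loss_b is assumed to be the form of Eq. 7;
# the actual loss terms are not reproduced here.

STAGES = [
    {"lambda1": 0.0, "lambda2": 1.0},  # stage 1: pre-train to convergence
    {"lambda1": 0.5, "lambda2": 0.5},  # stage 2: balanced training
    {"lambda1": 0.4, "lambda2": 0.5},  # stage 3: full training
]


def combined_loss(loss_a: float, loss_b: float,
                  lambda1: float, lambda2: float) -> float:
    """Weighted sum of the two training objectives (assumed form of Eq. 7)."""
    return lambda1 * loss_a + lambda2 * loss_b


def learning_rate(epoch: int, initial_lr: float = 0.001,
                  decay: float = 0.97) -> float:
    """Exponentially decayed Adam learning rate (decay assumed per epoch)."""
    return initial_lr * (decay ** epoch)
```

For example, in stage 1 only the second objective contributes (`combined_loss(x, y, 0.0, 1.0) == y`), matching the pre-training phase described in the setup.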