Progressive Open-Domain Response Generation with Multiple Controllable Attributes
Authors: Haiqin Yang, Xiaoyuan Yao, Yiqun Duan, Jianping Shen, Jie Zhong, Kun Zhang
Venue: IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct extensive evaluations to show that PHED significantly outperforms the state-of-the-art neural generation models and produces more diverse responses as expected. The contribution of our work is threefold: [...] (3) empirical evaluations clearly demonstrating the effectiveness of PHED. |
| Researcher Affiliation | Collaboration | Haiqin Yang¹, Xiaoyuan Yao¹, Yiqun Duan¹, Jianping Shen¹, Jie Zhong¹ and Kun Zhang² (¹Ping An Life Insurance Company of China; ²Carnegie Mellon University) |
| Pseudocode | No | The paper does not include a section explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured pseudocode blocks. |
| Open Source Code | Yes | Our implementation is in PyTorch. (Footnote 1: https://www.dropbox.com/s/1376kmhvuaxqe5h/PHED.zip?dl=0) |
| Open Datasets | Yes | The data is the short-text conversation dataset (STC) [Shang et al., 2015], collected from Sina Weibo, a Chinese social platform. |
| Dataset Splits | Yes | After setting the maximum number of characters in a response to 30, we obtain around 3.9 million dialog pairs and split them into the set of training, validation, and test with the ratio of 90%, 5%, and 5%, respectively. (A hedged sketch of this split appears after the table.) |
| Hardware Specification | Yes | Under the above settings, we train PHED for 10 epochs at each stage on a Tesla V100 GPU, which takes about 51 hours. |
| Software Dependencies | No | The paper mentions 'PyTorch' as the implementation framework but does not specify a version number or other software dependencies with their versions. |
| Experiment Setup | Yes | For each Transformer block, we set the number of self-attention heads to 8 and the hidden size (H) to 512. [...] trained by ADAM with a learning rate of 0.0001 and a batch size of 32. At inference, we set the beam search size to 5. (A hedged configuration sketch appears after the table.) |
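The dataset-splits row reports a 30-character response cap and a 90/5/5 train/validation/test split of roughly 3.9 million dialog pairs. The sketch below shows one plausible way to reproduce such a split; the function name `split_stc`, the `(post, response)` pair format, and the shuffle seed are illustrative assumptions, since the paper does not specify its splitting procedure.

```python
import random

def split_stc(dialog_pairs, seed=0):
    """Filter responses to <= 30 characters, then split 90/5/5.

    `dialog_pairs` is assumed to be an iterable of (post, response)
    string pairs; this format is an assumption, not from the paper.
    """
    pairs = [(post, resp) for post, resp in dialog_pairs if len(resp) <= 30]
    # Deterministic shuffle so the split is reproducible across runs.
    random.Random(seed).shuffle(pairs)
    n = len(pairs)
    n_train, n_val = int(0.90 * n), int(0.05 * n)
    train = pairs[:n_train]
    val = pairs[n_train:n_train + n_val]
    test = pairs[n_train + n_val:]  # remaining ~5%
    return train, val, test
```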
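The experiment-setup row quotes five concrete hyperparameters (8 attention heads, hidden size 512, ADAM at learning rate 0.0001, batch size 32, beam size 5). The following minimal PyTorch sketch wires those values into a generic model; `nn.Transformer` is only a stand-in, since the PHED architecture, layer counts, and vocabulary size are not given in the quoted text.

```python
import torch.nn as nn
import torch.optim as optim

# Hyperparameters quoted in the paper.
NUM_HEADS = 8    # self-attention heads per Transformer block
HIDDEN = 512     # hidden size H
LR = 1e-4        # ADAM learning rate
BATCH_SIZE = 32  # training batch size
BEAM_SIZE = 5    # beam search size at inference

# Generic stand-in model, NOT the PHED architecture: the paper's
# hierarchical/attribute-conditioned components are not reproduced here.
model = nn.Transformer(d_model=HIDDEN, nhead=NUM_HEADS)
optimizer = optim.Adam(model.parameters(), lr=LR)
```

BATCH_SIZE and BEAM_SIZE are defined only as constants here because batching and beam-search decoding depend on the (unspecified) data pipeline and decoder interface.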