Progressive Open-Domain Response Generation with Multiple Controllable Attributes

Authors: Haiqin Yang, Xiaoyuan Yao, Yiqun Duan, Jianping Shen, Jie Zhong, Kun Zhang

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we conduct extensive evaluations to show that PHED significantly outperforms the state-of-the-art neural generation models and produces more diverse responses as expected. The contribution of our work is threefold: [...] (3) empirical evaluations clearly demonstrating the effectiveness of PHED."
Researcher Affiliation | Collaboration | Haiqin Yang¹, Xiaoyuan Yao¹, Yiqun Duan¹, Jianping Shen¹, Jie Zhong¹ and Kun Zhang² (¹Ping An Life Insurance Company of China; ²Carnegie Mellon University)
Pseudocode | No | The paper does not include a section explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured pseudocode blocks.
Open Source Code | Yes | "Our implementation is in PyTorch." (footnoted code link: https://www.dropbox.com/s/1376kmhvuaxqe5h/PHED.zip?dl=0)
Open Datasets | Yes | "The data is the short-text conversation dataset (STC) [Shang et al., 2015], collected from Sina Weibo, a Chinese social platform."
Dataset Splits | Yes | "After setting the maximum number of characters in a response to 30, we obtain around 3.9 million dialog pairs and split them into the set of training, validation, and test with the ratio of 90%, 5%, and 5%, respectively." (See the split sketch after the table.)
Hardware Specification | Yes | "Under the above settings, we train PHED 10 epochs at each stage on a Tesla V100 GPU and cost about 51 hours."
Software Dependencies | No | The paper mentions PyTorch as the implementation framework but does not specify a version number or other software dependencies with their versions.
Experiment Setup | Yes | "For each Transformer block, we set the number of self-attention heads to 8 and the hidden size (H) to 512. [...] trained by ADAM with the learning rate 0.0001 and the batch size of 32. In the inference, we set the beam search size to 5." (See the configuration sketch after the table.)
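
The Dataset Splits row quotes a 90%/5%/5% partition of roughly 3.9 million dialog pairs. Below is a minimal sketch of such a split; the function name, the random seed, and the shuffling step are illustrative assumptions, not taken from the released PHED code.

```python
import random

def split_pairs(pairs, train_ratio=0.90, valid_ratio=0.05, seed=0):
    """Shuffle (post, response) pairs and split them 90/5/5 into
    train / validation / test, as described in the paper."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # the seed is an assumption, not from the paper
    n = len(pairs)
    n_train = int(n * train_ratio)
    n_valid = int(n * valid_ratio)
    return (pairs[:n_train],                   # ~90% training
            pairs[n_train:n_train + n_valid],  # ~5% validation
            pairs[n_train + n_valid:])         # ~5% test
```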
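
The Experiment Setup row reports 8 self-attention heads, a hidden size of 512, ADAM with learning rate 0.0001, a batch size of 32, and a beam size of 5 at inference. The sketch below only wires those reported numbers into standard PyTorch components; the feed-forward size, dropout, and the use of nn.TransformerEncoderLayer are assumptions, since PHED's actual blocks are defined in the authors' released code.

```python
import torch
import torch.nn as nn

# One Transformer block with the reported settings: 8 heads, hidden size 512.
# dim_feedforward and dropout are assumed defaults, not reported in the paper.
block = nn.TransformerEncoderLayer(d_model=512, nhead=8,
                                   dim_feedforward=2048, dropout=0.1)

# ADAM with learning rate 0.0001, as reported.
optimizer = torch.optim.Adam(block.parameters(), lr=1e-4)

BATCH_SIZE = 32  # training batch size reported in the paper
BEAM_SIZE = 5    # beam search width used at inference
```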