Stylized Dialogue Response Generation Using Stylized Unpaired Texts

Authors: Yinhe Zheng, Zikai Chen, Rongsheng Zhang, Shilei Huang, Xiaoxi Mao, Minlie Huang (pp. 14558-14567)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Automatic and manual evaluations on two datasets demonstrate that our method outperforms competitive baselines in producing coherent and style-intensive dialogue responses.
Researcher Affiliation Collaboration 1 Department of Computer Science and Technology, Institute for Artificial Intelligence, State Key Lab of Intelligent Technology and Systems, Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing, China. 2 Samsung Research China Beijing (SRC-B), Beijing, China. 3 Fuxi AI Lab, NetEase Inc., Hangzhou, China.
Pseudocode Yes Algorithm 1: Joint training process (a minimal Python sketch of this loop follows the table).
Input: M unpaired texts D_s = {t_i}_{i=1}^{M} in style S1; N dialogue pairs D_p = {⟨x_i, y_i⟩}_{i=1}^{N} in style S0.
Output: Stylized dialogue model.
1: Initialize the stylized and inverse dialogue models e, d, ê, d̂
2: while not converged do
3:   Sample n_d dialogue pairs D_p^b = {⟨x_i, y_i⟩}_{i=1}^{n_d} ⊆ D_p
4:   Train e and d by optimizing L_p2r (Eq. 6) on D_p^b
5:   Train ê and d̂ by optimizing L_r2p (Eq. 7) on D_p^b
6:   if current step > N_f then
7:     D_pp ← ∅
8:     Sample n_s stylized texts D_s^b = {t_i}_{i=1}^{n_s} ⊆ D_s
9:     for each t_i ∈ D_s^b do
10:      Decode m posts {x'_{ij}}_{j=1}^{m} from p_d̂(x | ê(t_i))
11:      D_pp ← D_pp ∪ {⟨x'_{ij}, t_i⟩}_{j=1}^{m}
12:    end for
13:    Train e and d by optimizing L_inv (Eq. 8) on D_pp
14:  end if
15: end while
Open Source Code No The paper does not state that its source code is released; it only promises the data: "The WDJN dataset will be released for public use."
Open Datasets Yes We collected 300K Weibo Dialogues (style S0) as Dp and sampled 95.1K stylized unpaired texts that are wrapped in quotation marks in Jin Yong's novels (style S1) as Ds. The WDJN dataset will be released for public use. TCFC (Wu, Wang, and Liu 2020): This dataset focuses on formality in English writing. We sampled 217.2K informal dialogue pairs (style S0) as Dp and 500.0K formal texts (style S1) as Ds from the original dataset, and used the test data in the original dataset as our test set Dt, which contains 1,956 manually crafted dialogue pairs (978 informal pairs and 978 formal pairs).
Dataset Splits No The paper describes training and test sets in Table 1 and in the text, but does not explicitly mention or quantify a validation set split for hyperparameter tuning or early stopping.
Hardware Specification No The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies No The paper mentions using pre-trained CDial-GPT and DialoGPT models, but does not provide specific version numbers for software dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes The top-K sampling process in Algorithm 1 uses K = 20 with a beam size of 4 (WDJN) or 2 (TCFC); see the generic top-K sketch below the table. N_f is set to 300. Training stops after 10 epochs over Dp (WDJN) or after 8,000 update steps (TCFC).
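
The following is a minimal Python sketch of the joint training loop in Algorithm 1. The helper names (train_forward, train_inverse, train_pseudo, decode_posts) and the batch-size defaults are hypothetical stand-ins, not names from the paper; the actual losses L_p2r (Eq. 6), L_r2p (Eq. 7), and L_inv (Eq. 8) depend on model internals not reproduced here.

import random

def joint_train(dialogue_pairs, stylized_texts,
                train_forward, train_inverse, train_pseudo, decode_posts,
                n_d=32, n_s=32, m=4, n_f=300, max_steps=8000):
    """Sketch of Algorithm 1.

    dialogue_pairs: list of (post, response) pairs in style S0 (D_p).
    stylized_texts: list of unpaired texts in style S1 (D_s).
    The four callables stand in for the paper's loss-specific updates.
    """
    for step in range(1, max_steps + 1):  # stands in for "while not converged"
        # Lines 3-5: supervised updates on a mini-batch of dialogue pairs.
        batch = random.sample(dialogue_pairs, n_d)
        train_forward(batch)   # optimize L_p2r (Eq. 6) on the dialogue model (e, d)
        train_inverse(batch)   # optimize L_r2p (Eq. 7) on the inverse model (ê, d̂)

        # Lines 6-14: after N_f warm-up steps, build pseudo dialogue pairs by
        # back-translating each stylized text t into m candidate posts.
        if step > n_f:
            pseudo_pairs = []
            for t in random.sample(stylized_texts, n_s):
                for x in decode_posts(t, m):     # line 10: decode m posts from p_d̂(x | ê(t))
                    pseudo_pairs.append((x, t))  # line 11: pseudo pair ⟨x', t⟩
            train_pseudo(pseudo_pairs)           # line 13: optimize L_inv (Eq. 8) on (e, d)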
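
Below is a generic top-K sampling sketch consistent with the reported K = 20. It is not the authors' implementation; the tensor shape (a 1-D vector of next-token logits) is my assumption.

import torch

def top_k_sample(logits: torch.Tensor, k: int = 20) -> torch.Tensor:
    """Sample one token id from the k highest-scoring logits."""
    topk_vals, topk_idx = torch.topk(logits, k)       # keep the top-k logits
    probs = torch.softmax(topk_vals, dim=-1)          # renormalize over the k tokens
    choice = torch.multinomial(probs, num_samples=1)  # draw one of the k positions
    return topk_idx.gather(-1, choice)                # map back to a vocabulary id

Restricting sampling to the K most probable tokens before renormalizing trades some diversity for fluency, which matters when decoding the pseudo posts used to build the pseudo-parallel pairs.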