Towards Diverse, Relevant and Coherent Open-Domain Dialogue Generation via Hybrid Latent Variables

Authors: Bin Sun, Yitong Li, Fei Mi, Weichao Wang, Yiwei Li, Kan Li

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on two dialogue generation datasets (Daily Dialog and Opensubtitles) show that CHVT is superior to the traditional transformer-based variational mechanism w.r.t. diversity, relevance and coherence metrics.
Researcher Affiliation | Collaboration | Bin Sun (1), Yitong Li (2,3), Fei Mi (2), Weichao Wang (2), Yiwei Li (1), Kan Li (1); (1) School of Computer Science & Technology, Beijing Institute of Technology; (2) Huawei Noah's Ark Lab; (3) Huawei Technologies Ltd. Emails: {binsun,liyiwei,likan}@bit.edu.cn, {liyitong3,mifei2,wangweichao9}@huawei.com
Pseudocode | No | The paper describes its methods and training steps but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper notes that 'A full version is available at https://arxiv.org/abs/2212.01145', but it does not state that source code for the methodology is released, nor does it link to a code repository.
Open Datasets | Yes | We conduct extensive experiments on Daily Dialog (Li et al. 2017b) and Opensubtitles (Lison and Tiedemann 2016) datasets.
Dataset Splits | Yes | We collected all dialogue pairs, reduced the repeat pairs, and divided them into training, validation and test sets. Table 1 lists key statistics: Daily Dialog: 68,066 train / 6,820 valid / 6,841 test; Open Subtitles: 200K train / 20K valid / 10K test.
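For reference, the described preprocessing (collect dialogue pairs, remove repeat pairs, split into train/valid/test) can be sketched as below. The paper reports only the resulting split sizes, so the split ratios, random seed, and function name here are illustrative assumptions, not the authors' actual procedure.

    import random

    def split_pairs(pairs, seed=0, ratios=(0.8, 0.1, 0.1)):
        """De-duplicate (context, response) pairs and split into train/valid/test.
        ratios and seed are placeholders; the paper gives only the final counts."""
        unique = list(dict.fromkeys(pairs))          # remove repeat pairs, keep order
        random.Random(seed).shuffle(unique)
        n_train = int(ratios[0] * len(unique))
        n_valid = int(ratios[1] * len(unique))
        train = unique[:n_train]
        valid = unique[n_train:n_train + n_valid]
        test = unique[n_train + n_valid:]
        return train, valid, test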
Hardware Specification | No | The paper does not report hardware details such as GPU/CPU models, memory, or cloud instance types used to run the experiments.
Software Dependencies | No | The paper does not list software dependencies with version numbers.
Experiment Setup | No | The paper mentions general training techniques, such as a 'KL annealing trick with a large warmup batch size' and a scale factor lambda on the KL divergence, but it does not provide numerical values for hyperparameters such as the learning rate, batch size, or number of epochs needed to reproduce the experiment setup.
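For context, KL annealing scales the KL term of the variational loss by a factor lambda that grows from 0 over a warmup period. The paper does not report the schedule's hyperparameters, so the warmup length and maximum weight in this minimal Python sketch are placeholders, not values from the paper.

    # Minimal sketch of linear KL annealing for a variational training loss.
    # warmup_steps and max_lambda are illustrative placeholders; the paper
    # does not report the actual values it used.

    def kl_weight(step: int, warmup_steps: int = 10_000, max_lambda: float = 1.0) -> float:
        """Linearly anneal the KL scale factor (lambda) from 0 to max_lambda."""
        return max_lambda * min(1.0, step / warmup_steps)

    def training_loss(reconstruction_loss: float, kl_divergence: float, step: int) -> float:
        """Total loss = reconstruction + lambda(step) * KL divergence."""
        return reconstruction_loss + kl_weight(step) * kl_divergence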