Multi-View Feature Representation for Dialogue Generation with Bidirectional Distillation

Authors: Shaoxiong Feng, Xuancheng Ren, Kan Li, Xu Sun | Pages 12812-12820

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on two high-quality open-domain dialogue datasets, Daily Dialog and Persona Chat, compared with state-of-the-art methods, and provide extensive analysis to examine the effect of the proposed method.
Researcher Affiliation | Academia | Shaoxiong Feng,^1 Xuancheng Ren,^2 Kan Li,^1 Xu Sun^2,3; ^1 School of Computer Science & Technology, Beijing Institute of Technology; ^2 MOE Key Laboratory of Computational Linguistics, School of EECS, Peking University; ^3 Center for Data Science, Peking University; {shaoxiongfeng, likan}@bit.edu.cn, {renxc, xusun}@pku.edu.cn
Pseudocode | No | The paper describes its methods using textual explanations and mathematical formulas, but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statements about releasing open-source code for the described methodology, nor does it provide links to any code repositories or mention supplementary materials containing code.
Open Datasets | Yes | We adopt two commonly-used dialogue datasets: Daily Dialog (Li et al. 2017b) and Persona Chat (Zhang et al. 2018a).
Dataset Splits | Yes | Finally, the processed dataset contains 50K, 4.5K, and 4.3K pairs for training, validation, and testing, respectively. (Daily Dialog) [...] The processed dataset contains 106K, 13K, and 12.5K pairs for training, validation, and testing. (Persona Chat)
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, processor types, or memory amounts used for running the experiments.
Software Dependencies | No | The paper mentions the use of 'Adam optimizer (Kingma and Ba 2015)' but does not provide specific version numbers for any software dependencies, libraries, or frameworks used for the experiments.
Experiment Setup | Yes | We set the embedding size to 500, the vocabulary size for both Daily Dialog and Persona Chat to 20K. The dropout probability and the temperature T are 0.1 and 3, respectively. We use Adam optimizer (Kingma and Ba 2015), with a learning rate of 0.0001, gradient clipping at 5.0, and a mini-batch size of 64. [...] We set the number of students to 6 for DML and MRBD. The imitation probability in MRBD is 0.5. The training set is randomly divided into six non-overlapping subsets with the same number of pairs.
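
The experiment setup above is reported only in prose, and the paper releases no code. The following is a minimal sketch of how those reported hyperparameters might be wired together, assuming a PyTorch implementation; the helper names (make_student_subsets, build_optimizer, training_step) and the model interface are hypothetical and not taken from the paper.

```python
import torch
from torch.utils.data import random_split

# Hyperparameters as reported in the paper's experiment setup.
EMBED_SIZE = 500       # embedding size
VOCAB_SIZE = 20_000    # vocabulary size for Daily Dialog and Persona Chat
DROPOUT = 0.1          # dropout probability
TEMPERATURE = 3.0      # distillation temperature T
LEARNING_RATE = 1e-4   # Adam learning rate
GRAD_CLIP = 5.0        # gradient clipping threshold
BATCH_SIZE = 64        # mini-batch size
NUM_STUDENTS = 6       # number of students for DML and MRBD
IMITATION_PROB = 0.5   # imitation probability in MRBD

def make_student_subsets(train_set, num_students=NUM_STUDENTS):
    """Randomly split the training set into non-overlapping subsets of
    (nearly) equal size, one per student."""
    base = len(train_set) // num_students
    lengths = [base] * num_students
    lengths[-1] += len(train_set) - sum(lengths)  # absorb any remainder
    return random_split(train_set, lengths)

def build_optimizer(model):
    """Adam optimizer with the reported learning rate."""
    return torch.optim.Adam(model.parameters(), lr=LEARNING_RATE)

def training_step(model, optimizer, batch):
    """One optimization step with gradient clipping at 5.0.
    Assumes the (hypothetical) model returns its training loss for a batch."""
    loss = model(batch)
    optimizer.zero_grad()
    loss.backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), GRAD_CLIP)
    optimizer.step()
    return loss.item()
```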