DialogWAE: Multimodal Response Generation with Conditional Wasserstein Auto-Encoder

Authors: Xiaodong Gu, Kyunghyun Cho, Jung-Woo Ha, Sunghun Kim

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on two popular datasets show that DialogWAE outperforms the state-of-the-art approaches in generating more coherent, informative and diverse responses.
Researcher Affiliation | Collaboration | Hong Kong University of Science and Technology; New York University; Clova AI Research, NAVER
Pseudocode | Yes | Algorithm 1: DialogWAE Training (UEnc: utterance encoder; CEnc: context encoder; RecNet: recognition network; PriNet: prior network; Dec: decoder); K = 3, n_critic = 5 in all experiments (a hypothetical training-loop sketch is given after the table).
Open Source Code | No | The paper does not provide an explicit statement or link to the open-source code for the described methodology.
Open Datasets | Yes | We evaluate our model on two dialogue datasets, Dailydialog (Li et al., 2017b) and Switchboard (Godfrey and Holliman, 1997), which have been widely used in recent studies (Shen et al., 2018; Zhao et al., 2017).
Dataset Splits | Yes | The datasets are separated into training, validation, and test sets with the same ratios as in the baseline papers, that is, 2316:60:62 for Switchboard (Zhao et al., 2017) and 10:1:1 for Dailydialog (Shen et al., 2018), respectively (a ratio-split helper is sketched after the table).
Hardware Specification | No | The paper mentions that models are 'fine-tuned with NAVER Smart Machine Learning (NSML) platform', but does not specify any hardware details like CPU, GPU models, or memory for the experimental setup.
Software Dependencies | Yes | All the models are implemented with Pytorch 0.4.0, and fine-tuned with NAVER Smart Machine Learning (NSML) platform (Sung et al., 2017; Kim et al., 2018).
Experiment Setup | Yes | The models are trained with mini-batches containing 32 examples each in an end-to-end manner. In the AE phase, the models are trained by SGD with an initial learning rate of 1.0 and gradient clipping at 1 (Pascanu et al., 2013). We decay the learning rate by 40% every 10th epoch. In the GAN phase, the models are updated using RMSprop (Tieleman and Hinton) with fixed learning rates of 5×10⁻⁵ and 1×10⁻⁵ for the generator and the discriminator, respectively. We tune the hyper-parameters on the validation set and measure the performance on the test set (an optimizer/scheduler configuration sketch follows the table).
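
The Pseudocode row summarizes Algorithm 1 only by its module names and constants. Below is a minimal, hypothetical PyTorch sketch of that alternating AE/GAN schedule, assuming toy layer sizes, dummy tensors in place of tokenized dialogues, an MSE reconstruction loss in place of the decoder's token-level likelihood, and a simple weight clip as a stand-in for the Lipschitz constraint on the Wasserstein critic; the K = 3 Gaussian-mixture prior components are omitted for brevity. None of the identifiers come from the authors' released code.

```python
import torch
import torch.nn as nn

# Assumed toy dimensions; the real model uses GRU encoders over word embeddings.
d_utt, d_ctx, d_z, batch = 16, 16, 8, 32

UEnc   = nn.GRU(d_utt, d_utt, batch_first=True)          # utterance (response) encoder
CEnc   = nn.GRU(d_utt, d_ctx, batch_first=True)          # context encoder
RecNet = nn.Linear(d_ctx + d_utt, 2 * d_z)               # recognition network: posterior over z given (c, x)
PriNet = nn.Linear(d_ctx, 2 * d_z)                       # prior network: prior over z given c
Dec    = nn.Linear(d_ctx + d_z, d_utt)                   # toy decoder (the real one is a GRU language model)
Disc   = nn.Sequential(nn.Linear(d_ctx + d_z, 32), nn.ReLU(), nn.Linear(32, 1))  # Wasserstein critic

ae_params  = [p for m in (UEnc, CEnc, RecNet, Dec) for p in m.parameters()]
gen_params = [p for m in (RecNet, PriNet) for p in m.parameters()]
opt_ae   = torch.optim.SGD(ae_params, lr=1.0)
opt_gen  = torch.optim.RMSprop(gen_params, lr=5e-5)
opt_disc = torch.optim.RMSprop(Disc.parameters(), lr=1e-5)
n_critic = 5   # critic updates per generator update, as reported in the Pseudocode row

def sample_z(net, cond):
    """Reparameterized Gaussian sample from a (mu, logvar) head."""
    mu, logvar = net(cond).chunk(2, dim=-1)
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)

for step in range(100):
    # Dummy batch: stand-ins for an encoded context window and the gold response.
    ctx_seq  = torch.randn(batch, 10, d_utt)
    resp_seq = torch.randn(batch, 10, d_utt)
    c = CEnc(ctx_seq)[1].squeeze(0)    # context summary
    x = UEnc(resp_seq)[1].squeeze(0)   # response summary

    # --- AE phase: reconstruct the response from (c, z ~ posterior) ---
    z_post = sample_z(RecNet, torch.cat([c, x], dim=-1))
    loss_ae = nn.functional.mse_loss(Dec(torch.cat([c, z_post], dim=-1)), x)
    opt_ae.zero_grad(); loss_ae.backward()
    torch.nn.utils.clip_grad_norm_(ae_params, 1.0)       # gradient clipping at 1
    opt_ae.step()

    c, x = c.detach(), x.detach()

    # --- GAN phase: critic separates posterior samples from prior samples ---
    for _ in range(n_critic):
        z_post  = sample_z(RecNet, torch.cat([c, x], dim=-1)).detach()
        z_prior = sample_z(PriNet, c).detach()
        d_loss = Disc(torch.cat([c, z_prior], dim=-1)).mean() \
               - Disc(torch.cat([c, z_post], dim=-1)).mean()
        opt_disc.zero_grad(); d_loss.backward(); opt_disc.step()
        for p in Disc.parameters():                       # crude Lipschitz stand-in
            p.data.clamp_(-0.01, 0.01)

    # Generator step: recognition and prior networks try to make the two
    # latent distributions indistinguishable to the critic.
    z_post  = sample_z(RecNet, torch.cat([c, x], dim=-1))
    z_prior = sample_z(PriNet, c)
    g_loss = Disc(torch.cat([c, z_post], dim=-1)).mean() \
           - Disc(torch.cat([c, z_prior], dim=-1)).mean()
    opt_gen.zero_grad(); g_loss.backward(); opt_gen.step()
```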
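
For the ratio-based DailyDialog split quoted in the Dataset Splits row, a minimal, hypothetical helper is sketched below; the SwitchBoard partition uses the fixed counts cited from the baseline (2316/60/62 dialogues) rather than a ratio, and the actual partitions come from the baseline papers, not from this snippet.

```python
from typing import List, Sequence, Tuple

def split_by_ratio(dialogues: Sequence, ratios: Tuple[int, int, int] = (10, 1, 1)) -> Tuple[List, List, List]:
    """Partition a list of dialogues into train/valid/test by an integer ratio."""
    total = sum(ratios)
    n_train = len(dialogues) * ratios[0] // total
    n_valid = len(dialogues) * ratios[1] // total
    train = list(dialogues[:n_train])
    valid = list(dialogues[n_train:n_train + n_valid])
    test  = list(dialogues[n_train + n_valid:])
    return train, valid, test

# Example with placeholder dialogue IDs and the 10:1:1 ratio reported for DailyDialog.
train, valid, test = split_by_ratio(range(1200))
assert (len(train), len(valid), len(test)) == (1000, 100, 100)
```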
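
The optimization settings quoted in the Experiment Setup row map directly onto standard PyTorch optimizers and a step scheduler. The following is a minimal configuration sketch under that reading, with placeholder modules standing in for the actual model components and "decay by 40%" interpreted as multiplying the learning rate by 0.6; it is not the authors' released configuration.

```python
import torch
import torch.nn as nn

# Placeholder parameter groups standing in for the actual model components.
ae_model   = nn.Linear(8, 8)   # encoder / recognition-network / decoder parameters (AE phase)
gen_model  = nn.Linear(8, 8)   # prior and recognition generator parameters (GAN phase)
disc_model = nn.Linear(8, 1)   # discriminator parameters (GAN phase)

batch_size = 32                # mini-batches of 32 examples, as reported

# AE phase: SGD with initial lr 1.0; lr decayed by 40% (x0.6) every 10th epoch.
opt_ae   = torch.optim.SGD(ae_model.parameters(), lr=1.0)
sched_ae = torch.optim.lr_scheduler.StepLR(opt_ae, step_size=10, gamma=0.6)

# GAN phase: RMSprop with fixed learning rates of 5e-5 (generator) and 1e-5 (discriminator).
opt_gen  = torch.optim.RMSprop(gen_model.parameters(), lr=5e-5)
opt_disc = torch.optim.RMSprop(disc_model.parameters(), lr=1e-5)

for epoch in range(30):
    for _ in range(5):                        # stand-in for iterating over mini-batches
        loss = ae_model(torch.randn(batch_size, 8)).pow(2).mean()   # dummy AE-phase loss
        opt_ae.zero_grad(); loss.backward()
        torch.nn.utils.clip_grad_norm_(ae_model.parameters(), 1.0)  # gradient clipping at 1
        opt_ae.step()
    # (GAN-phase updates with opt_gen / opt_disc are omitted here; see the sketch above.)
    sched_ae.step()                           # epoch-level decay of the AE-phase learning rate
```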