reproducibilityindex.ai

DialogWAE: Multimodal Response Generation with Conditional Wasserstein Auto-Encoder

Authors: Xiaodong Gu, Kyunghyun Cho, Jung-Woo Ha, Sunghun Kim

ICLR 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments on two popular datasets show that DialogWAE outperforms the state-of-the-art approaches in generating more coherent, informative and diverse responses.
Researcher Affiliation	Collaboration	Hong Kong University of Science and Technology, New York University, Clova AI Research, NAVER
Pseudocode	Yes	Algorithm 1: DialogWAE Training (UEnc: utterance encoder; CEnc: context encoder; RecNet: recognition network; PriNet: prior network; Dec: decoder). K=3, n_critic=5 in all experiments.
Open Source Code	No	The paper does not provide an explicit statement or link to the open-source code for the described methodology.
Open Datasets	Yes	We evaluate our model on two dialogue datasets, Dailydialog (Li et al., 2017b) and Switchboard (Godfrey and Holliman, 1997), which have been widely used in recent studies (Shen et al., 2018; Zhao et al., 2017).
Dataset Splits	Yes	The datasets are separated into training, validation, and test sets with the same ratios as in the baseline papers, that is, 2316:60:62 for Switchboard (Zhao et al., 2017) and 10:1:1 for Dailydialog (Shen et al., 2018), respectively.
Hardware Specification	No	The paper mentions that models are 'fine-tuned with NAVER Smart Machine Learning (NSML) platform', but does not specify any hardware details like CPU, GPU models, or memory for the experimental setup.
Software Dependencies	Yes	All the models are implemented with Pytorch 0.4.0, and fine-tuned with NAVER Smart Machine Learning (NSML) platform (Sung et al., 2017; Kim et al., 2018).
Experiment Setup	Yes	The models are trained with mini-batches containing 32 examples each in an end-to-end manner. In the AE phase, the models are trained by SGD with an initial learning rate of 1.0 and gradient clipping at 1 (Pascanu et al., 2013). We decay the learning rate by 40% every 10th epoch. In the GAN phase, the models are updated using RMSprop (Tieleman and Hinton) with fixed learning rates of 5 × 10^-5 and 1 × 10^-5 for the generator and the discriminator, respectively. We tune the hyper-parameters on the validation set and measure the performance on the test set.
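
The training settings quoted in the Pseudocode and Experiment Setup rows can be collected into a small configuration, which may help when attempting a reproduction. The sketch below is not the authors' code: the names `CONFIG` and `ae_learning_rate` are hypothetical, and it assumes that "decay the learning rate by 40%" means multiplying by 0.6 and that the decay takes effect at the start of 0-indexed epochs 10, 20, and so on.

```python
# Hyperparameters as quoted in the table above; the key names are illustrative.
CONFIG = {
    "batch_size": 32,
    "ae_initial_lr": 1.0,          # SGD, AE phase
    "ae_grad_clip": 1.0,           # gradient clipping threshold
    "ae_decay_factor": 0.6,        # assumption: "decay by 40%" = multiply by 0.6
    "ae_decay_every": 10,          # epochs between decays
    "gan_lr_generator": 5e-5,      # RMSprop, GAN phase
    "gan_lr_discriminator": 1e-5,  # RMSprop, GAN phase
    "n_critic": 5,                 # from Algorithm 1
    "K": 3,                        # from Algorithm 1
}


def ae_learning_rate(epoch, config=CONFIG):
    """Return the AE-phase learning rate for a 0-indexed epoch.

    Assumes a step schedule: the rate is multiplied by the decay factor
    once for every completed block of `ae_decay_every` epochs.
    """
    completed_blocks = epoch // config["ae_decay_every"]
    return config["ae_initial_lr"] * config["ae_decay_factor"] ** completed_blocks


if __name__ == "__main__":
    for epoch in (0, 9, 10, 20):
        print(epoch, ae_learning_rate(epoch))
```

Under these assumptions the schedule is a plain step decay; in PyTorch it would roughly correspond to `torch.optim.lr_scheduler.StepLR` with `step_size=10` and `gamma=0.6`, although the source does not say which scheduler the authors used.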