Improving Variational Encoder-Decoders in Dialogue Generation

Authors: Xiaoyu Shen, Hui Su, Shuzi Niu, Vera Demberg

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compare our model with current popular models and the experiment demonstrates substantial improvement in both metric-based and human evaluations."
Researcher Affiliation | Academia | (1) Max Planck Institute for Informatics, Germany; (2) Dept. of Math & CS and Lang Sci & Tech, Saarland University, Germany; (3) Saarland Informatics Campus, Germany; (4) Institute of Software, University of Chinese Academy of Sciences, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using TensorFlow, which is open source, but provides no link or statement that its own implementation is available.
Open Datasets | Yes | "We conduct our experiments on two dialogue datasets: Dailydialog (Li et al. 2017) and Switchboard (Godfrey and Holliman)."
Dataset Splits | Yes | "These two datasets are randomly separated into training/validation/test sets with the ratio of 10:1:1."
Hardware Specification | No | The paper mentions dialogs being "fed into the GPU memory" but does not specify any particular GPU model, CPU, or other hardware used for the experiments.
Software Dependencies | No | "We implemented all the models with the open-sourced Python library Tensorflow (Abadi et al. 2016) and optimized using the Adam optimizer (Kingma and Ba 2014)." No specific version numbers for TensorFlow or Python are provided.
Experiment Setup | Yes | "The vocabulary size was set as 20,000 and all the OOV words were mapped to a special token <unk>. We set word embeddings to size of 300... The first, second-layer encoder and decoder RNN in the following experiments are single-layer GRU with 512, 1024 and 512 hidden neurons. The dimension of latent variables is set to 512. The batch size is 128 and we fix the learning rate as 0.0002 for all models. Our framework is trained epochwise by alternatively training the CVAE and DAE part. The probability estimators for VAE are 2-layer feedforward neural networks. At test time, we output the most likely responses using beam search with beam size set to 5... We set an α value 5 for our collaborative model (CO) and set the scheduled sampling (SS) weight k = 2500 or 5000 for Dailydialog or Switchboard."
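The 10:1:1 random split reported under "Dataset Splits" can be sketched as follows. The paper does not describe its shuffling procedure, so the function name, the fixed seed, and the use of Python's `random` module are illustrative assumptions, not the authors' code.

```python
import random

def split_dialogues(dialogues, ratios=(10, 1, 1), seed=0):
    """Randomly split a list of dialogues into train/valid/test
    with the 10:1:1 ratio reported in the paper.

    The seed and shuffling scheme are assumptions; the paper only
    states that the split is random with ratio 10:1:1."""
    rng = random.Random(seed)
    shuffled = dialogues[:]
    rng.shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_valid = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

# Example: 1200 dialogues split 1000 / 100 / 100.
train, valid, test = split_dialogues(list(range(1200)))
```

Integer floor division keeps the three parts disjoint and exhaustive; any rounding remainder falls into the test portion.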
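The hyperparameters quoted under "Experiment Setup" can be collected into a single configuration for reference. The dictionary keys and the small helper below are illustrative (the paper does not publish its code); only the numeric values come from the quoted text.

```python
# Hyperparameters reported in the paper, gathered in one place.
# Key names are assumptions; values are taken from the quoted setup.
HPARAMS = {
    "vocab_size": 20_000,        # OOV words mapped to <unk>
    "embedding_dim": 300,        # word embedding size
    "encoder_hidden": 512,       # first-layer encoder GRU
    "context_hidden": 1024,      # second-layer encoder GRU
    "decoder_hidden": 512,       # decoder GRU
    "latent_dim": 512,           # latent variable dimension
    "batch_size": 128,
    "learning_rate": 2e-4,       # Adam, fixed for all models
    "beam_size": 5,              # beam search at test time
    "alpha": 5,                  # collaborative (CO) model weight
    "ss_weight_k": {"Dailydialog": 2500, "Switchboard": 5000},
}

def embedding_params(hp):
    """Rough parameter count for the embedding table alone
    (vocab_size x embedding_dim); a hypothetical helper."""
    return hp["vocab_size"] * hp["embedding_dim"]
```

With these values the embedding table alone holds 20,000 x 300 = 6,000,000 parameters, which gives a sense of the model scale even before the GRU layers are counted.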