Improving Variational Encoder-Decoders in Dialogue Generation

Authors: Xiaoyu Shen, Hui Su, Shuzi Niu, Vera Demberg

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We compare our model with current popular models and the experiment demonstrates substantial improvement in both metric-based and human evaluations."
Researcher Affiliation | Academia | (1) Max Planck Institute for Informatics, Germany; (2) Dept. of Math & CS and Lang Sci & Tech, Saarland University, Germany; (3) Saarland Informatics Campus, Germany; (4) Institute of Software, University of Chinese Academy of Sciences, China
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper mentions using TensorFlow, which is open source, but provides no link or statement that its own implementation is available.
Open Datasets | Yes | "We conduct our experiments on two dialogue datasets: Dailydialog (Li et al. 2017) and Switchboard (Godfrey and Holliman)."
Dataset Splits | Yes | "These two datasets are randomly separated into training/validation/test sets with the ratio of 10:1:1."
Hardware Specification | No | The paper mentions dialogs being "fed into the GPU memory" but does not specify any particular GPU model, CPU, or other hardware used for the experiments.
Software Dependencies | No | "We implemented all the models with the open-sourced Python library Tensorflow (Abadi et al. 2016) and optimized using the Adam optimizer (Kingma and Ba 2014)." No specific version numbers for TensorFlow or Python are provided.
Experiment Setup | Yes | "The vocabulary size was set as 20,000 and all the OOV words were mapped to a special token <unk>. We set word embeddings to size of 300... The first, second-layer encoder and decoder RNN in the following experiments are single-layer GRU with 512, 1024 and 512 hidden neurons. The dimension of latent variables is set to 512. The batch size is 128 and we fix the learning rate as 0.0002 for all models. Our framework is trained epochwise by alternatively training the CVAE and DAE part. The probability estimators for VAE are 2-layer feedforward neural networks. At test time, we output the most likely responses using beam search with beam size set to 5... We set an α value 5 for our collaborative model (CO) and set the scheduled sampling (SS) weight k = 2500 or 5000 for Dailydialog or Switchboard."
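The 10:1:1 random split reported under "Dataset Splits" can be sketched as follows. The paper does not describe its shuffling procedure, so the function name, the fixed seed, and the use of Python's `random` module are illustrative assumptions, not the authors' code.

```python
import random

def split_dialogues(dialogues, ratios=(10, 1, 1), seed=0):
    """Randomly split a list of dialogues into train/valid/test
    with the 10:1:1 ratio reported in the paper.

    The seed and shuffling scheme are assumptions; the paper only
    states that the split is random with ratio 10:1:1."""
    rng = random.Random(seed)
    shuffled = dialogues[:]
    rng.shuffle(shuffled)
    total = sum(ratios)
    n_train = len(shuffled) * ratios[0] // total
    n_valid = len(shuffled) * ratios[1] // total
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return train, valid, test

# Example: 1200 dialogues split 1000 / 100 / 100.
train, valid, test = split_dialogues(list(range(1200)))
```

Integer floor division keeps the three parts disjoint and exhaustive; any rounding remainder falls into the test portion.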
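The hyperparameters quoted under "Experiment Setup" can be collected into a single configuration for reference. The dictionary keys and the small helper below are illustrative (the paper does not publish its code); only the numeric values come from the quoted text.

```python
# Hyperparameters reported in the paper, gathered in one place.
# Key names are assumptions; values are taken from the quoted setup.
HPARAMS = {
    "vocab_size": 20_000,        # OOV words mapped to <unk>
    "embedding_dim": 300,        # word embedding size
    "encoder_hidden": 512,       # first-layer encoder GRU
    "context_hidden": 1024,      # second-layer encoder GRU
    "decoder_hidden": 512,       # decoder GRU
    "latent_dim": 512,           # latent variable dimension
    "batch_size": 128,
    "learning_rate": 2e-4,       # Adam, fixed for all models
    "beam_size": 5,              # beam search at test time
    "alpha": 5,                  # collaborative (CO) model weight
    "ss_weight_k": {"Dailydialog": 2500, "Switchboard": 5000},
}

def embedding_params(hp):
    """Rough parameter count for the embedding table alone
    (vocab_size x embedding_dim); a hypothetical helper."""
    return hp["vocab_size"] * hp["embedding_dim"]
```

With these values the embedding table alone holds 20,000 x 300 = 6,000,000 parameters, which gives a sense of the model scale even before the GRU layers are counted.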