Learning to Select Knowledge for Response Generation in Dialog Systems

Authors: Rongzhong Lian, Min Xie, Fan Wang, Jinhua Peng, Hua Wu

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on both automatic and human evaluation verify the superiority of our model over previous baselines. ... We conducted experiments on two recently created datasets, namely the Persona-chat dataset [Zhang et al., 2018] and the Wizard-of-Wikipedia dataset [Dinan et al., 2018]."
Researcher Affiliation | Collaboration | Rongzhong Lian¹, Min Xie², Fan Wang¹, Jinhua Peng¹, Hua Wu¹ (¹Baidu Inc., China; ²The Hong Kong University of Science and Technology)
Pseudocode | No | The paper describes the model architecture and components using text and mathematical equations, but it does not include any explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | "Our models and datasets are all available online: https://github.com/ifr2/PostKS."
Open Datasets | Yes | "We conducted experiments on two recently created datasets, namely the Persona-chat dataset [Zhang et al., 2018] and the Wizard-of-Wikipedia dataset [Dinan et al., 2018]."
Dataset Splits | Yes | "There are 151,157 turns (each turn corresponds to an utterance and a response pair) of conversations in Persona-chat, which we divide into 122,499 for train, 14,602 for validation and 14,056 for test. ... From this dataset, 79,925 turns of conversations are obtained and 68,931/3,686/7,308 of them are used for train/validation/test." (A split sketch follows the table.)
Hardware Specification | Yes | "We trained our model with at most 20 epochs on a P40 machine."
Software Dependencies | No | The paper mentions 'GloVe' for word embeddings and the 'Adam optimizer', but it does not specify version numbers for any software libraries or dependencies (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "Our encoders and decoders have 2-layer GRU structures with 800 hidden states for each layer, but they do not share any parameters. We set the word embedding size to be 300 and initialized it using GloVe [Pennington et al., 2014]. The vocabulary size is 20,000. We used the Adam optimizer with a mini-batch size of 128 and the learning rate is 0.0005. We trained our model with at most 20 epochs on a P40 machine. In the first 5 epochs, we minimize the BOW loss only for pre-training the knowledge manager. In the remaining epochs, we minimize over the sum of all losses." (A configuration sketch follows the table.)
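
The Dataset Splits row gives exact counts for both corpora (122,499/14,602/14,056 of 151,157 Persona-chat turns; 68,931/3,686/7,308 of 79,925 Wizard-of-Wikipedia turns). The minimal sketch below slices a turn list into splits of those sizes; it assumes the turns are already in the authors' original order and is only an illustration of the counts, since the data released in the PostKS repository remains the authoritative source.

```python
# Minimal sketch: slicing a turn list into the reported split sizes.
# Assumes `turns` is in the authors' original order; not the released preprocessing script.
PERSONA_CHAT_SIZES = {"train": 122_499, "valid": 14_602, "test": 14_056}      # sums to 151,157
WIZARD_OF_WIKIPEDIA_SIZES = {"train": 68_931, "valid": 3_686, "test": 7_308}  # sums to 79,925

def split_turns(turns, sizes):
    assert len(turns) == sum(sizes.values()), "turn count must match the reported total"
    splits, start = {}, 0
    for name, size in sizes.items():
        splits[name] = turns[start:start + size]
        start += size
    return splits

# Example with placeholder data in place of real (utterance, response) pairs:
dummy_turns = [("utterance", "response")] * 151_157
parts = split_turns(dummy_turns, PERSONA_CHAT_SIZES)
print({name: len(part) for name, part in parts.items()})
# {'train': 122499, 'valid': 14602, 'test': 14056}
```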
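
The Experiment Setup row lists enough hyperparameters to reconstruct a skeleton training loop. The sketch below is a minimal PyTorch reconstruction under stated assumptions: the `ToyKnowledgeSeq2Seq` class, the synthetic data, and the simplified NLL/BOW loss terms are placeholders rather than the released PostKS code; only the numeric hyperparameters and the two-phase schedule (BOW loss alone for the first 5 epochs, the sum of all losses afterwards) come from the quoted text.

```python
# Minimal PyTorch sketch of the quoted setup. ToyKnowledgeSeq2Seq, the synthetic
# data, and the loss definitions are illustrative placeholders, not the PostKS code.
import torch
from torch import nn, optim
from torch.nn import functional as F
from torch.utils.data import DataLoader, TensorDataset

VOCAB_SIZE = 20_000    # "The vocabulary size is 20,000."
EMBED_SIZE = 300       # 300-d word embeddings, GloVe-initialized in the paper
HIDDEN_SIZE = 800      # 800 hidden states per GRU layer
NUM_LAYERS = 2         # encoders and decoders are 2-layer GRUs with no shared parameters
BATCH_SIZE = 128
LR = 5e-4              # Adam with learning rate 0.0005
PRETRAIN_EPOCHS = 5    # BOW loss only, to pre-train the knowledge manager
MAX_EPOCHS = 20

class ToyKnowledgeSeq2Seq(nn.Module):
    """Stand-in model: separate 2-layer GRU encoder and decoder (no parameter sharing)."""
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(VOCAB_SIZE, EMBED_SIZE)  # GloVe initialization omitted here
        self.encoder = nn.GRU(EMBED_SIZE, HIDDEN_SIZE, NUM_LAYERS, batch_first=True)
        self.decoder = nn.GRU(EMBED_SIZE, HIDDEN_SIZE, NUM_LAYERS, batch_first=True)
        self.proj = nn.Linear(HIDDEN_SIZE, VOCAB_SIZE)

    def forward(self, src, tgt):
        _, hidden = self.encoder(self.embedding(src))
        dec_out, _ = self.decoder(self.embedding(tgt), hidden)
        nll = F.cross_entropy(self.proj(dec_out).transpose(1, 2), tgt)
        # Simplified bag-of-words term: predict target words from the final encoder state.
        bow_logits = self.proj(hidden[-1]).unsqueeze(1).expand(-1, tgt.size(1), -1)
        bow = F.cross_entropy(bow_logits.transpose(1, 2), tgt)
        return {"nll": nll, "bow": bow}

model = ToyKnowledgeSeq2Seq()
optimizer = optim.Adam(model.parameters(), lr=LR)

# Synthetic (source, target) token ids stand in for Persona-chat / Wizard-of-Wikipedia turns.
src = torch.randint(0, VOCAB_SIZE, (256, 20))
tgt = torch.randint(0, VOCAB_SIZE, (256, 20))
train_loader = DataLoader(TensorDataset(src, tgt), batch_size=BATCH_SIZE, shuffle=True)

for epoch in range(MAX_EPOCHS):
    for batch_src, batch_tgt in train_loader:
        optimizer.zero_grad()
        losses = model(batch_src, batch_tgt)
        if epoch < PRETRAIN_EPOCHS:
            loss = losses["bow"]          # first 5 epochs: minimize the BOW loss only
        else:
            loss = sum(losses.values())   # remaining epochs: minimize the sum of all losses
        loss.backward()
        optimizer.step()
```

Note that the quoted text ties the BOW loss to pre-training the knowledge manager, so in the actual model that term is computed from the selected knowledge rather than from the encoder state as in this placeholder; the sketch only illustrates the reported hyperparameter values and optimization schedule.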