Variational Learning for Unsupervised Knowledge Grounded Dialogs

Authors: Mayank Mishra, Dhiraj Madan, Gaurav Pandey, Danish Contractor

IJCAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Using a collection of three publicly available open-conversation datasets, we demonstrate how the posterior distribution, that has information from the ground-truth response, allows for a better approximation of the objective function during training. To overcome the challenges associated with sampling over a large knowledge collection, we develop an efficient approach to approximate the ELBO. To the best of our knowledge we are the first to apply variational training for open-scale unsupervised knowledge grounded dialog systems.
Researcher Affiliation | Industry | Mayank Mishra, Dhiraj Madan, Gaurav Pandey and Danish Contractor, IBM Research AI; mayank.mishra1@ibm.com, {dmadan07, gpandey1}@in.ibm.com, danish.contractor@ibm.com
Pseudocode | No | The paper describes the architecture and training process in text but does not include any pseudocode or algorithm blocks.
Open Source Code | Yes | We provide the code and the supplementary material at https://github.com/mayank31398/VRAG and https://arxiv.org/abs/2112.00653 respectively.
Open Datasets | Yes | Using a collection of three publicly available open-conversation datasets, OR-QuAC [Qu et al., 2020], DSTC9 [Kim et al., 2020b], DoQA [Campos et al., 2020]
Dataset Splits | No | The paper mentions 'validation sets' and 'early stopping with patience = 5 on recall of the validation sets' but does not specify the size, percentage, or method of creating the validation split.
Hardware Specification | No | The paper does not explicitly describe the hardware (e.g., specific GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions software components like the 'BERT model', 'GPT2', 'DPR-Multiset model', and 'AdamW optimizer' but does not specify their version numbers for reproducibility.
Experiment Setup | Yes | We initialize our document-prior (for both RAG and VRAG) and document-posterior (for VRAG) networks with the pretrained DPR-Multiset model, pre-trained using data from the Natural Questions [Kwiatkowski et al., 2019], TriviaQA [Joshi et al., 2017], etc. During the training of the model, it can be difficult to rebuild the document index after every change to the document representation parameters in f; therefore, similar to Lewis et al., the parameters in f are kept constant. We used early stopping with patience = 5 on recall of the validation sets to prevent overfitting of models. The loss was optimized using the AdamW optimizer [Loshchilov and Hutter, 2017]. We also found it useful to continue training the response-likelihood for both RAG and VRAG after the joint training is complete.
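
The Research Type row quotes the paper's central technical claim: a response-informed posterior over knowledge documents gives a better training objective, and the ELBO is approximated efficiently rather than summed over the full knowledge collection. As a rough illustration only, and not the authors' released implementation, the sketch below shows a standard way to estimate an ELBO with a discrete document latent by restricting both the posterior and the prior to a small set of k retrieved documents and renormalizing; the function name, tensor names, and the top-k restriction are assumptions.

import torch

def topk_elbo(posterior_scores: torch.Tensor,
              prior_scores: torch.Tensor,
              response_log_likelihood: torch.Tensor) -> torch.Tensor:
    """Approximate ELBO for a discrete document latent, restricted to k retrieved documents.

    posterior_scores:         posterior-network scores for the k documents, shape [k]
    prior_scores:             prior-network scores for the same k documents, shape [k]
    response_log_likelihood:  log p(response | context, document) for each document, shape [k]
    """
    log_q = torch.log_softmax(posterior_scores, dim=-1)  # log q(z | context, response)
    log_p = torch.log_softmax(prior_scores, dim=-1)      # log p(z | context)
    q = log_q.exp()
    # ELBO = E_q[log p(y | x, z)] - KL(q(z | x, y) || p(z | x)), both estimated over the k documents
    return (q * (response_log_likelihood + log_p - log_q)).sum()

# Example with k = 4 retrieved documents (dummy numbers); training would minimize the negative ELBO.
loss = -topk_elbo(torch.randn(4), torch.randn(4), -torch.rand(4) * 10)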
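
The Experiment Setup row states that the prior and posterior retrievers are initialized from the pretrained DPR-Multiset model, that the document encoder f is kept frozen so the index need not be rebuilt, that the loss is optimized with AdamW, and that early stopping uses patience = 5 on validation recall. Below is a minimal sketch of that configuration, assuming the Hugging Face DPR multiset checkpoints and a GPT-2 generator (the paper mentions GPT2 but not specific checkpoints); the learning rate and the caller-supplied training/evaluation callables are placeholders, not values reported in the paper.

import torch
from transformers import DPRContextEncoder, DPRQuestionEncoder, GPT2LMHeadModel

# Prior p(z|x) and posterior q(z|x,y) query encoders, both initialized from DPR-Multiset.
prior_encoder = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-multiset-base")
posterior_encoder = DPRQuestionEncoder.from_pretrained("facebook/dpr-question_encoder-multiset-base")

# Document encoder f: frozen, so the document index never has to be rebuilt during training.
doc_encoder = DPRContextEncoder.from_pretrained("facebook/dpr-ctx_encoder-multiset-base")
for param in doc_encoder.parameters():
    param.requires_grad = False

# Response-likelihood network (generator); GPT-2 is an assumption about the exact checkpoint.
generator = GPT2LMHeadModel.from_pretrained("gpt2")

trainable = (list(prior_encoder.parameters())
             + list(posterior_encoder.parameters())
             + list(generator.parameters()))
optimizer = torch.optim.AdamW(trainable, lr=3e-5)  # learning rate is an assumption

def fit(train_one_epoch, eval_recall, max_epochs: int = 100, patience: int = 5) -> float:
    """Joint training with early stopping on validation recall (patience = 5, as in the paper)."""
    best_recall, epochs_without_improvement = 0.0, 0
    for _ in range(max_epochs):
        train_one_epoch(optimizer)  # caller-supplied: one pass of joint training
        recall = eval_recall()      # caller-supplied: retrieval recall on the validation set
        if recall > best_recall:
            best_recall, epochs_without_improvement = recall, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_recall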