Reasoning in Dialog: Improving Response Generation by Context Reading Comprehension
Authors: Xiuying Chen, Zhi Cui, Jiayi Zhang, Chen Wei, Jianwei Cui, Bin Wang, Dongyan Zhao, Rui Yan
AAAI 2021, pp. 12683–12691 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on this dataset, and the results show that the proposed model brings substantial improvements over several strong baselines on both tasks. |
| Researcher Affiliation | Collaboration | Xiuying Chen1,2, Zhi Cui3, Jiayi Zhang3, Chen Wei3, Jianwei Cui3, Bin Wang3, Dongyan Zhao1,2, and Rui Yan4,5 1Wangxuan Institute of Computer Technology, Peking University, Beijing, China 2Center for Data Science, AAIS, Peking University, Beijing, China 3Xiaomi AI Lab 4Gaoling School of Artificial Intelligence, Renmin University of China 5Beijing Academy of Artificial Intelligence |
| Pseudocode | No | The paper provides detailed descriptions of the model architecture and mathematical equations but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: 'We release our large-scale dataset for further research1. 1https://github.com/yingtaomj/Reasoning-in-Dialog'. This explicitly mentions the release of the dataset, but not the source code for the methodology. |
| Open Datasets | Yes | Hence, we first propose a dialog reading comprehension dataset (DRCD). ... We release our large-scale dataset for further research1. 1https://github.com/yingtaomj/Reasoning-in-Dialog. |
| Dataset Splits | Yes | We randomly split the dataset with question-answer pair to 113,116 training, 3,000 validation, and 3,000 test cases. |
| Hardware Specification | Yes | We implement our experiments in TensorFlow (Abadi et al. 2016) on an NVIDIA GTX 1080 Ti GPU. |
| Software Dependencies | No | The paper mentions implementing experiments in 'TensorFlow (Abadi et al. 2016)' but does not specify a version number for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | The word embedding dimension is set to 128 and the number of hidden units is 256. We initialize all of the parameters randomly using a Gaussian distribution. The batch size is set to 16, and we limit the vocabulary size to 50K. We use Adagrad optimizer (Duchi, Hazan, and Singer 2010) as our optimizing algorithm. We also apply gradient clipping (Pascanu, Mikolov, and Bengio 2013) with a range of [-2, 2] during training. During the inference stage, the checkpoint with the smallest validation loss is chosen and the beam-search size is set to 4 for all methods. |
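
The experiment-setup row above lists concrete hyperparameters but no reference implementation. Below is a minimal sketch of that configuration in TensorFlow (the framework the authors report using); the encoder choice (embedding plus GRU), the learning rate, and the initializer's standard deviation are assumptions not stated in the paper, included only to make the snippet self-contained.

```python
# Hedged sketch of the reported training configuration; not the authors' code.
import tensorflow as tf

VOCAB_SIZE = 50_000   # vocabulary limited to 50K
EMBED_DIM  = 128      # word embedding dimension
HIDDEN_DIM = 256      # number of hidden units
BATCH_SIZE = 16
BEAM_SIZE  = 4        # beam-search size used at inference time (not shown here)

# "Initialize all of the parameters randomly using a Gaussian distribution";
# the standard deviation below is an assumption.
gaussian_init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=0.01)

embedding = tf.keras.layers.Embedding(
    input_dim=VOCAB_SIZE,
    output_dim=EMBED_DIM,
    embeddings_initializer=gaussian_init,
)

# Encoder architecture is an assumption; the paper only fixes the hidden size.
encoder = tf.keras.layers.GRU(
    HIDDEN_DIM,
    return_sequences=True,
    kernel_initializer=gaussian_init,
)

# Adagrad with element-wise gradient clipping to [-2, 2];
# the learning rate is a placeholder, not taken from the paper.
optimizer = tf.keras.optimizers.Adagrad(learning_rate=0.15, clipvalue=2.0)
```

Checkpoint selection (smallest validation loss) and beam search with size 4 would sit in the training/inference loop, which the paper does not specify in enough detail to reconstruct here.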