reproducibilityindex.ai

KPT: Keyword-Guided Pre-training for Grounded Dialog Generation

Authors: Qi Zhu, Fei Mi, Zheng Zhang, Yasheng Wang, Yitong Li, Xin Jiang, Qun Liu, Xiaoyan Zhu, Minlie Huang

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct extensive experiments on various few-shot knowledge-grounded generation tasks, including grounding on dialog acts, knowledge graphs, persona descriptions, and Wikipedia passages. Our comprehensive experiments and analyses demonstrate that KPT consistently outperforms state-of-the-art methods on these tasks with diverse grounding knowledge.
Researcher Affiliation	Collaboration	(1) CoAI Group, DCST, IAI, BNRIST, Tsinghua University; (2) Huawei Noah's Ark Lab
Pseudocode	Yes	Algorithm 1: Prepare keyword-guided pre-training data
Open Source Code	No	The paper mentions using 'ConvLab-3 (Zhu et al. 2022) for dataset loading and model training', which is a third-party toolkit, but does not provide a link or statement for the authors' own implementation code.
Open Datasets	Yes	As shown in Table 1, our pre-training datasets include DailyDialog (Li et al. 2017), Schema-Guided Dialog (Rastogi et al. 2020), Taskmaster-1/2/3 (Byrne et al. 2019, 2021), MetaLWOZ (Li et al. 2020), DSTC8-Reddit (Lee et al. 2019), and WikiDialog (Dai et al. 2022), covering chit-chats, goal-oriented dialogs, and information seeking dialogs.
Dataset Splits	Yes	We randomly split the data into training (70%), validation (15%), and test set (15%). We fine-tune the models until the validation loss does not decrease for 5 consecutive epochs. Models with the lowest validation losses during training are selected as the final models.
Hardware Specification	Yes	We set the batch size per GPU to 64 and use 8/2 Tesla V100 32G GPUs for pre-training/fine-tuning.
Software Dependencies	No	The paper mentions software components such as 'T5 (Raffel et al. 2020)', 'GPT-2 Large (Radford et al. 2019)', 'DialoGPT Large (762M)', and 'ConvLab-3 (Zhu et al. 2022)'. Although specific models and toolkits are named, explicit version numbers for T5, GPT-2, or PyTorch/CUDA are not provided.
Experiment Setup	Yes	We consider two sizes of model: 60M T5-small and 220M T5-base. For both RG and KPT, we pre-train the models for 1 epoch. During pre-training, we set the keyword ratio α to 0.3... We use Adafactor optimizer with a constant learning rate 1e-3 for both pre-training and fine-tuning. We set the batch size per GPU to 64...
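
The Dataset Splits row describes a random 70/15/15 split and model selection by lowest validation loss, with training stopped after 5 consecutive epochs without improvement. The following is a minimal sketch of that procedure in plain Python, not the authors' implementation. The function names, the `seed` parameter, and the reading of "does not decrease" as "does not improve on the best loss seen so far" are assumptions made for illustration; the paper reports no seed.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_70_15_15(items: Sequence[T], seed: int = 0) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle items and split them into train (70%), validation (15%), test (rest).

    The seed is a hypothetical parameter; the paper does not report one.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = len(shuffled) * 70 // 100
    n_val = len(shuffled) * 15 // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


def select_best_epoch(val_losses: Sequence[float], patience: int = 5) -> Tuple[int, int]:
    """Return (best_epoch, stop_epoch), both 0-indexed.

    Training stops once the validation loss has failed to improve on the
    best value seen so far for `patience` consecutive epochs (assumed
    interpretation). The epoch with the lowest validation loss is selected.
    """
    best_epoch = 0
    best_loss = float("inf")
    stale = 0
    stop_epoch = -1
    for epoch, loss in enumerate(val_losses):
        stop_epoch = epoch
        if loss < best_loss:
            best_loss, best_epoch, stale = loss, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break
    return best_epoch, stop_epoch


if __name__ == "__main__":
    train, val, test = split_70_15_15(range(100))
    print(len(train), len(val), len(test))
    print(select_best_epoch([3.0, 2.5, 2.6, 2.7, 2.8, 2.9, 3.0, 1.0]))
```

In the second call, the loss at epoch 1 is never beaten during the next five epochs, so training stops at epoch 6 and the later, lower value is never reached. This is the usual trade-off of patience-based stopping.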