DialogLM: Pre-trained Model for Long Dialogue Understanding and Summarization

Authors: Ming Zhong, Yang Liu, Yichong Xu, Chenguang Zhu, Michael Zeng

AAAI 2022, pp. 11765-11773

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct extensive experiments on five datasets of long dialogues, covering tasks of dialogue summarization, abstractive question answering and topic segmentation. Experimentally, we show that our pre-trained model DIALOGLM significantly surpasses the state-of-the-art models across datasets and tasks.
Researcher Affiliation | Collaboration | Ming Zhong*1 (1University of Illinois at Urbana-Champaign; 2Microsoft Cognitive Services Research Group); mingz5@illinois.edu, {yaliu10, yichong.xu, chezhu, nzeng}@microsoft.com
Pseudocode | No | The paper includes diagrams and describes the steps of its method in prose, but there are no explicitly labeled pseudocode blocks or algorithm listings.
Open Source Code | Yes | Source code and all the pretrained models are available on our GitHub repository (https://github.com/microsoft/DialogLM).
Open Datasets | Yes | Pretraining data is the combination of the MediaSum dataset (Zhu et al. 2021) and the OpenSubtitles corpus (Lison and Tiedemann 2016) (see Table 2).
Dataset Splits | No | The paper mentions using well-known datasets such as AMI, ICSI, QMSum, ForeverDreaming, and TVMegaSite, and refers to a 'test set' for evaluation. However, it does not provide specific details on training, validation, and test splits (e.g., percentages or sample counts) for any of these datasets.
Hardware Specification | Yes | 8 A100 GPUs with 40GB memory are used to complete the experiments in this paper.
Software Dependencies | No | The paper mentions the use of models like UNILM and Transformer, but it does not specify any software libraries, frameworks, or programming languages with their version numbers that were used for the experiments.
Experiment Setup | Yes | To pre-train DIALOGLM, we further train UNILM with the window-based denoising framework for a total of 200,000 steps on dialogue data, of which 20,000 are warmup steps. We set batch size to 64 and the maximum learning rate to 2e-5. (A hedged configuration sketch follows the table.)
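The following is a minimal sketch of how the reported pre-training hyperparameters (200,000 total steps, 20,000 warmup steps, global batch size 64, peak learning rate 2e-5, 8x A100 40GB) could be expressed as a Hugging Face Seq2SeqTrainingArguments configuration. This is an illustrative assumption, not the authors' released training script; the output directory, scheduler type, and precision setting are hypothetical.

```python
# Hedged sketch only: hyperparameter values are taken from the paper's description;
# argument names follow the Hugging Face transformers API, not the authors' code.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="dialoglm-pretrain",   # hypothetical output directory
    max_steps=200_000,                # total pre-training steps reported in the paper
    warmup_steps=20_000,              # warmup steps reported in the paper
    learning_rate=2e-5,               # maximum (peak) learning rate reported in the paper
    per_device_train_batch_size=8,    # 8 per GPU x 8 A100 GPUs = global batch size of 64
    lr_scheduler_type="linear",       # assumption: the scheduler type is not stated
    fp16=True,                        # assumption: mixed precision on A100 GPUs
    logging_steps=100,
    save_steps=10_000,
)
```

Under these assumptions, the per-device batch size of 8 reproduces the reported global batch size of 64 only when all 8 GPUs are used without gradient accumulation; any other device count would require adjusting gradient_accumulation_steps accordingly.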