Local Explanation of Dialogue Response Generation

Authors: Yi-Lin Tuan, Connor Pryor, Wenhu Chen, Lise Getoor, William Yang Wang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our results show that our method consistently improves other widely used methods on proposed automatic and human evaluation metrics for this new task by 4.4-12.8%.
Researcher Affiliation | Academia | 1 University of California, Santa Barbara; 2 University of California, Santa Cruz. {ytuan, wenhuchen, william}@cs.ucsb.edu; {cfpryor, getoor}@ucsc.edu
Pseudocode | Yes | Algorithm 1: LOCAL EXPLANATION OF RESPONSE GENERATION (a rough perturb-and-compare sketch of this idea appears after the table).
Open Source Code | Yes | Our code is available at https://github.com/Pascalson/LERG.
Open Datasets | Yes | We specifically select and study a popular conversational dataset called DailyDialog [25] because its dialogues are based on daily topics and have less uninformative responses. (A loading sketch appears after the table.)
Dataset Splits | No | No explicit training, validation, or test dataset split percentages or counts are provided in the main text. The paper mentions training until the loss converges and reports test perplexities, but without detailing the splits (e.g., "We train until the loss converges on both models and achieve fairly low test perplexities compared to [25]: 12.35 and 11.83 respectively.").
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments are mentioned in the main text.
Software Dependencies | No | No specific software dependencies with version numbers are provided in the main text for reproducibility.
Experiment Setup | Yes | We fine-tune a GPT-based language model [33, 47] and a DialoGPT [55] on DailyDialog by minimizing the following loss function: $\mathcal{L} = -\sum_j \log P_\theta(y_j \mid x, y_{<j})$ (17).
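
To make the quoted setup concrete, here is a minimal sketch of the Eq. (17) objective: the standard negative log-likelihood of the response tokens given the dialogue context, computed with Hugging Face Transformers. The `gpt2` checkpoint and the example strings are placeholders, not the authors' released models or data.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "How was your day?"        # x: dialogue history
response = " Pretty good, thanks!"   # y: reference response

ctx_ids = tokenizer(context, return_tensors="pt").input_ids
resp_ids = tokenizer(response, return_tensors="pt").input_ids
input_ids = torch.cat([ctx_ids, resp_ids], dim=-1)

# Supervise only the response positions, so the loss is
# -sum_j log P_theta(y_j | x, y_<j), averaged over response tokens.
labels = input_ids.clone()
labels[:, : ctx_ids.size(-1)] = -100  # -100 is ignored by the cross-entropy loss

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # an optimizer step would complete one fine-tuning update
```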
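
The DailyDialog corpus referenced above is also mirrored on the Hugging Face Hub; assuming that mirror, loading it is a one-liner. The split sizes it ships with are a property of the Hub version, not figures stated in the paper's main text.

```python
from datasets import load_dataset

# DailyDialog as hosted on the Hugging Face Hub (assumed mirror of [25]);
# the Hub version comes with train/validation/test splits even though
# the paper's main text does not spell them out.
dialogues = load_dataset("daily_dialog")
print({split: len(dialogues[split]) for split in dialogues})
first_dialogue = dialogues["train"][0]["dialog"]  # a list of utterance strings
```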
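
Finally, the pseudocode row refers to Algorithm 1 (local explanation of response generation). As a rough illustration of the underlying perturb-and-compare idea, here is an occlusion-style sketch: delete one context token at a time and record how much each response token's log-probability drops. This is a baseline approximation under stated assumptions, not the paper's LERG estimators, and both function names are hypothetical.

```python
import torch

@torch.no_grad()
def token_log_probs(model, ctx_ids, resp_ids):
    """Return log P(y_j | x, y_<j) for every response token y_j."""
    input_ids = torch.cat([ctx_ids, resp_ids], dim=-1)
    log_probs = model(input_ids=input_ids).logits.log_softmax(dim=-1)
    # The logit at position t predicts the token at position t + 1,
    # so response predictions start at the last context position.
    start = ctx_ids.size(-1) - 1
    pred = log_probs[0, start : start + resp_ids.size(-1)]
    return pred.gather(-1, resp_ids[0].unsqueeze(-1)).squeeze(-1)

@torch.no_grad()
def occlusion_salience(model, ctx_ids, resp_ids):
    """Salience matrix: drop in log-prob of y_j when x_i is removed.

    Assumes the context has at least two tokens so the perturbed
    context is never empty. Shape of the result: (|x|, |y|).
    """
    base = token_log_probs(model, ctx_ids, resp_ids)
    rows = []
    for i in range(ctx_ids.size(-1)):
        keep = [k for k in range(ctx_ids.size(-1)) if k != i]
        perturbed = ctx_ids[:, keep]
        rows.append(base - token_log_probs(model, perturbed, resp_ids))
    return torch.stack(rows)
```

A large entry at (i, j) means deleting context token x_i sharply reduces the model's confidence in response token y_j, i.e., x_i locally explains y_j under this occlusion proxy.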