Local Explanation of Dialogue Response Generation

Authors: Yi-Lin Tuan, Connor Pryor, Wenhu Chen, Lise Getoor, William Yang Wang

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirically, our results show that our method consistently improves other widely used methods on proposed automatic and human evaluation metrics for this new task by 4.4-12.8%.
Researcher Affiliation | Academia | 1 University of California, Santa Barbara; 2 University of California, Santa Cruz. {ytuan, wenhuchen, william}@cs.ucsb.edu; {cfpryor, getoor}@ucsc.edu
Pseudocode | Yes | Algorithm 1: LOCAL EXPLANATION OF RESPONSE GENERATION (a rough perturb-and-compare sketch of this idea appears after the table).
Open Source Code | Yes | Our code is available at https://github.com/Pascalson/LERG.
Open Datasets | Yes | We specifically select and study a popular conversational dataset called DailyDialog [25] because its dialogues are based on daily topics and have less uninformative responses. (A loading sketch appears after the table.)
Dataset Splits | No | No explicit training, validation, or test dataset split percentages or counts are provided in the main text. The paper mentions training until the loss converges and reports test perplexities, but without detailing the splits (e.g., "We train until the loss converges on both models and achieve fairly low test perplexities compared to [25]: 12.35 and 11.83 respectively.").
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory specifications) used for running experiments are mentioned in the main text.
Software Dependencies | No | No specific software dependencies with version numbers are provided in the main text for reproducibility.
Experiment Setup | Yes | We fine-tune a GPT-based language model [33, 47] and a DialoGPT [55] on DailyDialog by minimizing the following loss function: $\mathcal{L} = -\sum_j \log P_\theta(y_j \mid x, y_{<j})$ (17).
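
To make the quoted setup concrete, here is a minimal sketch of the Eq. (17) objective: the standard negative log-likelihood of the response tokens given the dialogue context, computed with Hugging Face Transformers. The `gpt2` checkpoint and the example strings are placeholders, not the authors' released models or data.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

context = "How was your day?"        # x: dialogue history
response = " Pretty good, thanks!"   # y: reference response

ctx_ids = tokenizer(context, return_tensors="pt").input_ids
resp_ids = tokenizer(response, return_tensors="pt").input_ids
input_ids = torch.cat([ctx_ids, resp_ids], dim=-1)

# Supervise only the response positions, so the loss is
# -sum_j log P_theta(y_j | x, y_<j), averaged over response tokens.
labels = input_ids.clone()
labels[:, : ctx_ids.size(-1)] = -100  # -100 is ignored by the cross-entropy loss

loss = model(input_ids=input_ids, labels=labels).loss
loss.backward()  # an optimizer step would complete one fine-tuning update
```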
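
The DailyDialog corpus referenced above is also mirrored on the Hugging Face Hub; assuming that mirror, loading it is a one-liner. The split sizes it ships with are a property of the Hub version, not figures stated in the paper's main text.

```python
from datasets import load_dataset

# DailyDialog as hosted on the Hugging Face Hub (assumed mirror of [25]);
# the Hub version comes with train/validation/test splits even though
# the paper's main text does not spell them out.
dialogues = load_dataset("daily_dialog")
print({split: len(dialogues[split]) for split in dialogues})
first_dialogue = dialogues["train"][0]["dialog"]  # a list of utterance strings
```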
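
Finally, the pseudocode row refers to Algorithm 1 (local explanation of response generation). As a rough illustration of the underlying perturb-and-compare idea, here is an occlusion-style sketch: delete one context token at a time and record how much each response token's log-probability drops. This is a baseline approximation under stated assumptions, not the paper's LERG estimators, and both function names are hypothetical.

```python
import torch

@torch.no_grad()
def token_log_probs(model, ctx_ids, resp_ids):
    """Return log P(y_j | x, y_<j) for every response token y_j."""
    input_ids = torch.cat([ctx_ids, resp_ids], dim=-1)
    log_probs = model(input_ids=input_ids).logits.log_softmax(dim=-1)
    # The logit at position t predicts the token at position t + 1,
    # so response predictions start at the last context position.
    start = ctx_ids.size(-1) - 1
    pred = log_probs[0, start : start + resp_ids.size(-1)]
    return pred.gather(-1, resp_ids[0].unsqueeze(-1)).squeeze(-1)

@torch.no_grad()
def occlusion_salience(model, ctx_ids, resp_ids):
    """Salience matrix: drop in log-prob of y_j when x_i is removed.

    Assumes the context has at least two tokens so the perturbed
    context is never empty. Shape of the result: (|x|, |y|).
    """
    base = token_log_probs(model, ctx_ids, resp_ids)
    rows = []
    for i in range(ctx_ids.size(-1)):
        keep = [k for k in range(ctx_ids.size(-1)) if k != i]
        perturbed = ctx_ids[:, keep]
        rows.append(base - token_log_probs(model, perturbed, resp_ids))
    return torch.stack(rows)
```

A large entry at (i, j) means deleting context token x_i sharply reduces the model's confidence in response token y_j, i.e., x_i locally explains y_j under this occlusion proxy.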