Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Towards Scalable Multi-Domain Conversational Agents: The Schema-Guided Dialogue Dataset
Authors: Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan
AAAI 2020, pp. 8689–8696 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our model on public datasets WOZ 2.0 and MultiWOZ 2.1 (Eric et al. 2019). As results in Table 4 show, our model performs competitively on these datasets. In these experiments, we omit the use of fuzzy matching scores and use exact match while calculating the goal accuracies to keep our numbers comparable with other works. |
| Researcher Affiliation | Industry | Abhinav Rastogi, Xiaoxue Zang, Srinivas Sunkara, Raghav Gupta, Pranav Khaitan Google Research, Mountain View, California, USA EMAIL |
| Pseudocode | No | The paper details the model components and their mathematical formulations but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our model code is available at github.com/google-research/google-research/tree/master/schema_guided_dst |
| Open Datasets | Yes | In this work, we introduce the Schema-Guided Dialogue (SGD) dataset, containing over 16k multi-domain conversations spanning 16 domains. The dataset has been released at github.com/google-research-datasets/dstc8-schema-guided-dialogue |
| Dataset Splits | Yes | The 20 domains present across the train, dev and test splits are listed in Table 2. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'BERT' as a pre-trained model but does not specify its version or other software dependencies with their respective version numbers. |
| Experiment Setup | No | The paper describes the model architecture and its components but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size), optimizer settings, or training schedules. |
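The validation step described in the notice above, in which automated LLM classifications are checked against a manually labeled dataset, can be sketched as a per-variable agreement computation. This is an illustrative sketch only: the function name, label keys, and data layout are assumptions, not the actual pipeline from [1].

```python
# Hypothetical sketch: per-variable agreement between LLM-produced
# reproducibility labels and manual gold labels. All names are assumed.
from collections import Counter

def agreement_by_variable(llm_labels, manual_labels):
    """Return the fraction of papers where the LLM label matches the
    manual label, computed separately for each reproducibility variable.

    Both inputs map (paper_id, variable) -> label,
    e.g. ("rastogi2020", "Open Source Code") -> "Yes".
    """
    correct, total = Counter(), Counter()
    for key, manual in manual_labels.items():
        _, variable = key
        total[variable] += 1
        if llm_labels.get(key) == manual:
            correct[variable] += 1
    return {v: correct[v] / total[v] for v in total}

# Tiny made-up example with two papers and two variables.
manual = {
    ("p1", "Open Source Code"): "Yes",
    ("p2", "Open Source Code"): "No",
    ("p1", "Pseudocode"): "No",
}
llm = {
    ("p1", "Open Source Code"): "Yes",
    ("p2", "Open Source Code"): "Yes",  # disagreement with manual label
    ("p1", "Pseudocode"): "No",
}
print(agreement_by_variable(llm, manual))
# -> {'Open Source Code': 0.5, 'Pseudocode': 1.0}
```

In practice the published methodology reports fuller accuracy metrics than raw agreement; this sketch only shows the shape of the comparison, not the actual evaluation in [1].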