Building Task-Oriented Dialogue Systems for Online Shopping

Authors: Zhao Yan, Nan Duan, Peng Chen, Ming Zhou, Jianshe Zhou, Zhoujun Li

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Interesting and insightful observations are shown in the experimental part, based on the analysis of human-bot conversation logs. Several current challenges are also pointed out as our future directions. We evaluate product category detection in two settings. In the Off-line setting, we first transform each ⟨q, url_p⟩ pair (6,315,233 in total) to a ⟨q, C_p⟩ pair... Evaluation results are shown in Table 6...
Researcher Affiliation | Collaboration | Zhao Yan, Nan Duan, Peng Chen, Ming Zhou, Jianshe Zhou and Zhoujun Li. State Key Lab of Software Development Environment, Beihang University, Beijing, China; Microsoft Research, Beijing, China; Microsoft Xiaoice Team, Beijing, China; BAICIT, Capital Normal University, Beijing, China. {yanzhao, lizj}@buaa.edu.cn; {nanduan, peche, mingzhou}@microsoft.com; {zhoujs}@cnu.edu.cn
Pseudocode | Yes | Algorithm 1: Intent Phrase Mining... Algorithm 2: Product Attribute Extraction... Algorithm 3: Global Search
Open Source Code | No | The paper does not provide a link to a source code repository or an explicit statement that the code for the described methodology is released. It mentions using existing resources and methodologies but does not offer the authors' implementation.
Open Datasets | Yes | We crawl raw questions from Baidu Zhidao. After filtering the full question set based on the product knowledge base, there are 3,146,063 questions left in QD.
Dataset Splits | Yes | We evaluate product category detection in two settings. In the Off-line setting, we first transform each ⟨q, url_p⟩ pair (6,315,233 in total) to a ⟨q, C_p⟩ pair by finding p's category in the product knowledge base. Next, we split the ⟨q, C_p⟩ pairs into a training set (8/10), a dev set (1/10) and a test set (1/10). A minimal split sketch appears below the table.
Hardware Specification | No | The paper does not specify any hardware details such as CPU, GPU models, memory, or cloud computing instances used for conducting the experiments.
Software Dependencies | No | The paper describes the use of a 'CNN-based approach' and 'stochastic gradient descent (SGD)' for training but does not provide specific software dependencies like library names with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x).
Experiment Setup | Yes | Input Layer. Traditionally, each word after tokenization can be represented by a one-hot word vector whose dimensionality equals the vocabulary size... We then obtain the representation of the t-th word n-gram in an utterance Q by concatenating the character vectors of each word as l_t = [w_{t-d}^T, ..., w_t^T, ..., w_{t+d}^T]^T, where w_t denotes the t-th word representation and n = 2d + 1 denotes the contextual window size, which is set to 3. The model is trained by maximizing the likelihood of the correctly associated product categories given training utterances, using stochastic gradient descent (SGD). The minimum support threshold is set to 5 for Frequent Phrase Mining, and the topic size is set to 1,000 for Phrase LDA. A sketch of the window construction appears below the table.
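
The following minimal Python sketch makes the 8/10 : 1/10 : 1/10 split from the Dataset Splits row concrete. It is an illustration under assumptions, not the authors' code: the function name split_pairs, the fixed seed, and the dummy (question, category) tuples are all hypothetical.

    # Hypothetical sketch of the 8/10 train, 1/10 dev, 1/10 test split
    # described in the paper; the authors' splitting script is not released.
    import random

    def split_pairs(pairs, seed=0):
        """Shuffle (q, C_p) pairs, then split them 8/10 : 1/10 : 1/10."""
        pairs = list(pairs)
        random.Random(seed).shuffle(pairs)
        n_train, n_dev = len(pairs) * 8 // 10, len(pairs) // 10
        train = pairs[:n_train]
        dev = pairs[n_train:n_train + n_dev]
        test = pairs[n_train + n_dev:]
        return train, dev, test

    # Example with dummy (question, category) pairs:
    train, dev, test = split_pairs([(f"q{i}", "beauty") for i in range(1000)])
    print(len(train), len(dev), len(test))  # 800 100 100

Shuffling before slicing keeps the three sets disjoint while drawing them from the same distribution; the paper does not say whether its split was random, so the shuffle is a design assumption.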
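Similarly, the formula quoted in the Experiment Setup row, l_t = [w_{t-d}^T, ..., w_t^T, ..., w_{t+d}^T]^T with n = 2d + 1 = 3, can be illustrated by the sketch below, which builds these window vectors. Dense word vectors and zero-padding at the utterance edges are assumptions (the paper derives its word representations from character vectors), and window_representations is a hypothetical name.

    # Hypothetical sketch: build l_t = [w_{t-d}; ...; w_t; ...; w_{t+d}]
    # for every position t, with window size n = 2d + 1 = 3 as in the paper.
    # Dense vectors and zero-padding at the edges are assumptions.
    import numpy as np

    def window_representations(word_vectors, d=1):
        """word_vectors: (num_words, dim) array; returns (num_words, n * dim)."""
        dim = word_vectors.shape[1]
        pad = np.zeros((d, dim))                  # zero-pad the utterance edges
        padded = np.vstack([pad, word_vectors, pad])
        return np.stack([padded[t:t + 2 * d + 1].reshape(-1)  # concatenate window
                         for t in range(word_vectors.shape[0])])

    # Example: 5 words with 4-dim vectors -> five (3 * 4 = 12)-dim windows.
    l = window_representations(np.random.rand(5, 4))
    print(l.shape)  # (5, 12)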