CoChat: Enabling Bot and Human Collaboration for Task Completion
Authors: Xufang Luo, Zijia Lin, Yunhong Wang, Zaiqing Nie
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on real-world datasets demonstrate that CoChat can relieve most of the human workers' workload and achieve better user satisfaction rates compared to other state-of-the-art frameworks. |
| Researcher Affiliation | Collaboration | Xufang Luo, Zijia Lin, Yunhong Wang, Zaiqing Nie. Beijing Advanced Innovation Center for Big Data and Brain Computing, Beihang University, Beijing, China; Microsoft Research, Beijing, China; Alibaba AI Labs, Beijing, China. {luoxufang,yhwang}@buaa.edu.cn; zijlin@microsoft.com; zaiqing.nzq@alibaba-inc.com |
| Pseudocode | No | The paper includes mathematical equations but no structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a link to the source code for the CoChat framework or the MemHRNN model. The only URL provided is for supplementary material containing 'The full list of actions'. |
| Open Datasets | No | We hired human workers and users who are familiar with two realistic tasks, i.e., booking restaurants and booking movie tickets, and collected their dialogs to build two datasets for our experiments. The paper does not provide concrete access information (link, DOI, or explicit statement of public availability) for these datasets. |
| Dataset Splits | No | The paper describes training with the 'first 50 collected dialogs' and then 'subsequent dialogs' for online and reinforcement learning, and using user simulators for training and testing, but it does not explicitly define a separate validation split. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks). |
| Experiment Setup | Yes | we set the rewards of reinforcement learning as follows: a fixed punishment (i.e., -0.025) for every turn, a large reward (i.e., 1) for successful task completions and a small one (i.e., 0.5) for failures. The balancing factor λ is empirically set as 0.05; α is a balancing factor empirically set as 0.1 in the experiments, and σ is a smoothing parameter (see the reward sketch below the table). |
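
The reward values quoted in the Experiment Setup row can be read as a simple per-dialog return. Below is a minimal Python sketch, assuming an undiscounted sum over dialog turns; the function name and interface are hypothetical, and only the numeric values (-0.025 per turn, 1 for success, 0.5 for failure) are taken from the paper.

```python
# Sketch of the per-dialog reward scheme quoted above (hypothetical interface).
# Only the numeric values come from the paper; everything else is illustrative.

TURN_PENALTY = -0.025   # fixed punishment applied at every dialog turn
SUCCESS_REWARD = 1.0    # large reward for a successful task completion
FAILURE_REWARD = 0.5    # smaller reward when the task is not completed

def dialog_return(num_turns: int, task_completed: bool) -> float:
    """Undiscounted return for one dialog under the quoted reward settings."""
    terminal = SUCCESS_REWARD if task_completed else FAILURE_REWARD
    return num_turns * TURN_PENALTY + terminal

# Example: a 10-turn dialog that completes the task
# yields 10 * (-0.025) + 1.0 = 0.75.
print(dialog_return(num_turns=10, task_completed=True))  # 0.75
```

The balancing factors λ (0.05) and α (0.1) and the smoothing parameter σ belong to loss and scoring formulas that the report does not reproduce, so they are left out of the sketch.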