Dialog State Tracking with Reinforced Data Augmentation
Authors: Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu
AAAI 2020, pp. 9474–9481
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the WoZ and MultiWoZ (restaurant) datasets demonstrate that the proposed framework significantly improves the performance over the state-of-the-art models, especially with limited training data. |
| Researcher Affiliation | Industry | Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu — Huawei Noah's Ark Lab {yinyichun, shang.lifeng, jiang.xin, chen.xiao2, qun.liu}@huawei.com |
| Pseudocode | Yes | Algorithm 1: The Reinforced Data Augmentation.<br>Input: Pre-trained Tracker with parameters θr; the randomly initialized Generator with parameters θπ.<br>Output: Re-trained Tracker.<br>1: Store θπ<br>2: for l = 1 to L do<br>3: Re-initialize the Generator with θπ<br>4: for n = 1 to N do<br>5: Re-initialize the Tracker with θr<br>6: Sample a bag B<br>7: for j = 1 to M do<br>8: Sample a new bag B′_j<br>9: end for<br>10: Compute bag reward with Eq. 5<br>11: Compute instance reward with Eq. 6<br>12: Update θπ by the gradients in Eq. 4<br>13: end for<br>14: Obtain new data D′ by the Generator<br>15: Re-train the Tracker on D ∪ D′, update θr<br>16: end for<br>17: Save the Tracker with θr that performs best on the validation set among the L epochs |
| Open Source Code | No | The paper mentions implementing the model using PyTorch and provides a link to pytorch.org in a footnote, but it does not state that the authors' own source code for the described methodology is publicly available. |
| Open Datasets | Yes | We use WoZ (Wen et al. 2017) and MultiWoZ (Budzianowski et al. 2018) to evaluate the proposed framework on the task of dialog state tracking. |
| Dataset Splits | No | The paper mentions using a "validation set" and performing "sub-sampling experiments with ... different ratios [10%, 20%, 50%] of the training set" but does not specify the explicit train/validation/test splits (e.g., percentages or counts) for the main datasets used in the core experiments. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments. |
| Software Dependencies | No | The paper states: "We implement the proposed model using PyTorch." (Footnote 5 points to https://pytorch.org/.) However, it only mentions PyTorch without a specific version number, and no other software dependencies with version numbers are listed. |
| Experiment Setup | Yes | All hyper-parameters of our model are tuned based on the validation set. [...] The epoch number of the alternate learning L, the epoch number of the generator learning N and the sampling times M for each bag are set to 5, 200 and 2 respectively. We set the dimensions of all hidden states to 200 in both the Tracker and the Generator, and set the head number of multihead Self-Attention to 4 in the Tracker. All learnable parameters are optimized by the ADAM optimizer with a learning rate of 1e-3. The batch size is set to 16 in the Tracker learning, and the bag size in the Generator learning is set to 25. To avoid over-fitting, we apply dropout to the layer of word embeddings with a rate of 0.2. [...] The newly augmented dataset is n times the size of the original training data (n = 5 for WoZ and n = 3 for MultiWoZ). At each iteration, we randomly sample a subset of the augmented data to train the Tracker. The sampling ratios are 0.4 for WoZ and 0.3 for MultiWoZ. |
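The alternate-learning loop of Algorithm 1 can be sketched as runnable Python. This is only a structural illustration: the `Tracker` and `Generator` classes below are toy placeholders (the paper's models are neural networks), the reward and update steps stand in for Eq. 4–6 rather than implementing them, and only the loop shape and the hyperparameter defaults (L = 5, N = 200, M = 2, bag size 25) are taken from the paper.

```python
# Structural sketch of Algorithm 1 (Reinforced Data Augmentation).
# Tracker/Generator are toy stand-ins; rewards and updates are
# placeholders for the paper's Eq. 4-6, not reproductions of them.
import copy
import random

random.seed(0)

class Tracker:
    """Toy stand-in for the dialog state tracker."""
    def __init__(self, params):
        self.params = params

    def evaluate(self, data):
        # Placeholder validation score: favors larger training sets.
        return min(1.0, 0.5 + 0.01 * len(data))

    def retrain(self, data):
        self.params = dict(self.params, trained_on=len(data))

class Generator:
    """Toy stand-in for the augmentation-policy generator."""
    def __init__(self, params):
        self.params = params

    def sample_bag(self, data, bag_size):
        return random.sample(data, min(bag_size, len(data)))

    def augment(self, bag):
        # Placeholder "paraphrase": tag each instance as augmented.
        return [f"{x}+aug" for x in bag]

    def update(self, reward):
        # Placeholder for the policy-gradient step of Eq. 4.
        self.params["steps"] = self.params.get("steps", 0) + 1

def reinforced_data_augmentation(train_data, L=5, N=200, M=2, bag_size=25):
    theta_r = {"pretrained": True}   # pre-trained Tracker parameters
    theta_pi = {}                    # stored initial Generator parameters
    best_tracker, best_score = None, -1.0

    for _ in range(L):                                 # alternate learning
        gen = Generator(copy.deepcopy(theta_pi))       # re-init Generator
        for _ in range(N):                             # generator learning
            tracker = Tracker(copy.deepcopy(theta_r))  # re-init Tracker
            bag = gen.sample_bag(train_data, bag_size)
            new_bags = [gen.augment(bag) for _ in range(M)]
            # Bag/instance rewards (Eq. 5-6): here a simple placeholder,
            # the change in Tracker score when augmented data is added.
            reward = tracker.evaluate(bag + new_bags[0]) - tracker.evaluate(bag)
            gen.update(reward)
        new_data = gen.augment(train_data)             # D'
        tracker = Tracker(copy.deepcopy(theta_r))
        tracker.retrain(train_data + new_data)         # re-train on D ∪ D'
        theta_r = tracker.params
        score = tracker.evaluate(train_data + new_data)
        if score > best_score:                         # keep best of L epochs
            best_tracker, best_score = tracker, score
    return best_tracker

# Tiny run (N shrunk from the paper's 200 for speed).
tracker = reinforced_data_augmentation([f"dlg{i}" for i in range(50)], N=3)
```

The key structural point the sketch preserves is that the Tracker is re-initialized from θr inside the generator-learning loop (so each reward reflects the effect of one bag), while θr itself is only updated in the outer loop after re-training on the union of original and augmented data.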