Meta-Learning for Low-resource Natural Language Generation in Task-oriented Dialogue Systems
Authors: Fei Mi, Minlie Huang, Jiyong Zhang, Boi Faltings
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on a large multi-domain dataset (MultiWOZ) with diverse linguistic variations. We show that Meta-NLG significantly outperforms other training procedures in various low-resource configurations. |
| Researcher Affiliation | Academia | 1 Artificial Intelligence Laboratory, École Polytechnique Fédérale de Lausanne (EPFL); 2 Institute for Artificial Intelligence, Beijing National Research Center for Information Science and Technology, Department of Computer Science and Technology, Tsinghua University; 3 Department of Automation, Hangzhou Dianzi University |
| Pseudocode | Yes | Algorithm 1: Meta-NLG(f_θ, θ_0, D_s, α, β) (a hedged first-order sketch of this meta-update is given below the table) |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We used a recently released large-scale multi-domain dialog dataset (MultiWOZ, [Budzianowski et al., 2018]). |
| Dataset Splits | Yes | 69,607 annotated utterances are used, with 55,026, 7,291, 7,290 for training, validation, and testing respectively. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using 'PyTorch' but does not specify a version number for PyTorch or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | For Meta-NLG, we set the batch size to 50, α = 0.1, and β = 0.001. A single inner gradient update is used per meta update with Adam [Kingma and Ba, 2014]. The size of a Meta-NLG task is set to 400, with 200 samples assigned to each of its two subsets (D_{T_i} and D'_{T_i}). The maximum number of epochs is set to 100 during training and fine-tuning, and early stopping is conducted on a small validation set of size 200. We used the default setting of hyperparameters (n_layer = 1, hidden_size = 100, dropout = 0.25, clip = 0.5, beam_width = 5). (A configuration sketch collecting these values appears below the table.) |
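The Pseudocode row cites Algorithm 1, Meta-NLG(f_θ, θ_0, D_s, α, β), which follows the MAML-style inner/outer update scheme. Below is a minimal, first-order sketch of such a meta update in PyTorch, using the learning rates and task split reported in the setup row. The toy MLP, the `sample_meta_task` helper, and the first-order gradient approximation are assumptions for illustration; the paper's actual model is a semantically conditioned NLG generator and its Algorithm 1 may use the full second-order MAML objective.

```python
# Hypothetical first-order sketch of a MAML-style meta update, loosely following
# the Meta-NLG(f_theta, theta_0, D_s, alpha, beta) signature quoted above.
# Model, task sampler, and loss are stand-ins, not the authors' code.
import copy
import torch
import torch.nn as nn

ALPHA = 0.1    # inner-loop learning rate (paper: alpha = 0.1)
BETA = 0.001   # outer/meta learning rate (paper: beta = 0.001)

model = nn.Sequential(nn.Linear(16, 100), nn.ReLU(), nn.Linear(100, 4))  # stand-in for the NLG model
meta_optimizer = torch.optim.Adam(model.parameters(), lr=BETA)
loss_fn = nn.CrossEntropyLoss()

def sample_meta_task(task_size=400):
    """Stand-in for sampling a Meta-NLG task: 200 support + 200 query examples."""
    x = torch.randn(task_size, 16)
    y = torch.randint(0, 4, (task_size,))
    return (x[:200], y[:200]), (x[200:], y[200:])  # (D_Ti, D'_Ti)

def meta_update(tasks_per_batch=1):
    meta_optimizer.zero_grad()
    for _ in range(tasks_per_batch):
        (xs, ys), (xq, yq) = sample_meta_task()

        # Inner loop: one gradient step on D_Ti from the current initialization.
        adapted = copy.deepcopy(model)
        inner_loss = loss_fn(adapted(xs), ys)
        grads = torch.autograd.grad(inner_loss, adapted.parameters())
        with torch.no_grad():
            for p, g in zip(adapted.parameters(), grads):
                p -= ALPHA * g

        # Outer loop: evaluate the adapted parameters on D'_Ti and accumulate
        # first-order gradients back onto the shared initialization.
        outer_loss = loss_fn(adapted(xq), yq)
        outer_grads = torch.autograd.grad(outer_loss, adapted.parameters())
        for p, g in zip(model.parameters(), outer_grads):
            p.grad = g if p.grad is None else p.grad + g

    meta_optimizer.step()

meta_update()
```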
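The Experiment Setup row lists the reported hyperparameters in running text. The sketch below simply collects them into one configuration object and pairs it with a generic early-stopping wrapper, since the paper states that early stopping is used on a 200-example validation set but does not give a patience value; the `patience` default and the callback names are assumptions, not the authors' code.

```python
# Hypothetical collection of the hyperparameters reported in the setup row.
CONFIG = {
    "batch_size": 50,
    "alpha": 0.1,          # inner-loop learning rate
    "beta": 0.001,         # meta (Adam) learning rate
    "inner_steps": 1,      # single inner gradient update per meta update
    "task_size": 400,      # 200 samples for D_Ti, 200 for D'_Ti
    "max_epochs": 100,     # for both meta-training and fine-tuning
    "val_size": 200,       # small validation set used for early stopping
    "n_layer": 1,
    "hidden_size": 100,
    "dropout": 0.25,
    "clip": 0.5,           # gradient clipping threshold
    "beam_width": 5,       # beam search width at decoding time
}

def train_with_early_stopping(train_epoch, evaluate, patience=5):
    """Generic early-stopping loop; `patience` is an assumption (not stated in the paper)."""
    best_val, epochs_without_improvement = float("inf"), 0
    for epoch in range(CONFIG["max_epochs"]):
        train_epoch(epoch)          # caller-supplied training callback
        val_loss = evaluate()       # caller-supplied validation callback
        if val_loss < best_val:
            best_val, epochs_without_improvement = val_loss, 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
```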