Towards Multi-Intent Spoken Language Understanding via Hierarchical Attention and Optimal Transport
Authors: Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Yaowei Li, Xianwei Zhuang, Yuexian Zou
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that our model achieves state-of-the-art performance on two public Multi-Intent SLU datasets, obtaining a 3.4% improvement in overall accuracy on the MixATIS dataset compared to the previous best models. |
| Researcher Affiliation | Academia | Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Yaowei Li, Xianwei Zhuang, Yuexian Zou*; School of ECE, Peking University, China; {chengxx, zhihongzhu, lihongxiang, ywl, xwzhuang}@stu.pku.edu.cn, zouyx@pku.edu.cn |
| Pseudocode | No | No structured pseudocode or algorithm blocks explicitly labeled as such. |
| Open Source Code | No | The paper cites 'https://github.com/LooperXX/AGIF' as the source of the datasets, but it does not provide a link to, or an explicit release statement for, the code of the proposed HAOT framework. |
| Open Datasets | Yes | We conduct all the experiments on two public Multi-Intent SLU datasets [1], including the MixATIS dataset and the MixSNIPS dataset (Qin et al. 2020). The MixATIS dataset is collected from ATIS (Hemphill, Godfrey, and Doddington 1990) and the MixSNIPS dataset is collected from SNIPS (Coucke et al. 2018). [1] https://github.com/LooperXX/AGIF |
| Dataset Splits | Yes | Table 1 (Dataset statistics) reports a validation set size of 756 for MixATIS and 2,198 for MixSNIPS. |
| Hardware Specification | Yes | All the experiments are conducted on an Nvidia V100 GPU. |
| Software Dependencies | No | The paper mentions using an 'Adam optimizer' but does not provide specific version numbers for software dependencies like programming languages or libraries (e.g., Python, PyTorch, TensorFlow). |
| Experiment Setup | Yes | We leverage an Adam optimizer (Kingma and Ba 2015) with β1 = 0.9, β2 = 0.98, and 4k warm-up updates to optimize parameters in our framework, where we linearly increase the learning rate from 5e-4 to 1e-3. The batch size is set to 32. The number of encoder layers Ne is set to 4, the Transformer input and output dimension dmodel is set to 128, the number of the attention heads is set to 8, and the dropout ratio is set to 0.1. |
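
The hyperparameters quoted in the experiment setup row translate directly into a standard training configuration. The following is a minimal sketch assuming PyTorch; the plain Transformer encoder stack, the variable names, and the behavior of the learning rate after warm-up are illustrative assumptions, not the authors' released code (HAOT itself adds hierarchical attention and optimal-transport components that are not reproduced here).

```python
import torch
import torch.nn as nn

# Hyperparameters quoted from the paper's experiment setup.
D_MODEL = 128          # Transformer input/output dimension d_model
NUM_LAYERS = 4         # number of encoder layers Ne
NUM_HEADS = 8          # attention heads
DROPOUT = 0.1
BATCH_SIZE = 32
WARMUP_STEPS = 4000    # 4k warm-up updates
LR_START, LR_PEAK = 5e-4, 1e-3

# Illustrative encoder: a stack of standard Transformer encoder layers with the
# reported dimensions; this stands in for the paper's encoder, not for HAOT itself.
encoder_layer = nn.TransformerEncoderLayer(
    d_model=D_MODEL, nhead=NUM_HEADS, dropout=DROPOUT, batch_first=True
)
encoder = nn.TransformerEncoder(encoder_layer, num_layers=NUM_LAYERS)

# Adam with the reported betas; the learning rate is increased linearly
# from 5e-4 to 1e-3 over the first 4k updates, as stated in the paper.
optimizer = torch.optim.Adam(encoder.parameters(), lr=LR_START, betas=(0.9, 0.98))

def warmup_lr(step: int) -> float:
    """Multiplicative LR factor: linear ramp from LR_START to LR_PEAK, then flat
    (the post-warm-up schedule is an assumption; the paper does not specify it)."""
    if step >= WARMUP_STEPS:
        return LR_PEAK / LR_START
    frac = step / WARMUP_STEPS
    return 1.0 + frac * (LR_PEAK / LR_START - 1.0)

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=warmup_lr)
```

Calling `scheduler.step()` once after every optimizer update ramps the effective learning rate from 5e-4 to 1e-3 over the first 4k updates; since the paper only describes the warm-up, the sketch simply holds the peak rate afterwards.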