Few-shot Learning for Multi-label Intent Detection
Authors: Yutai Hou, Yongkui Lai, Yushan Wu, Wanxiang Che, Ting Liu (pp. 13036-13044)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on two datasets show that the proposed model significantly outperforms strong baselines in both one-shot and five-shot settings. |
| Researcher Affiliation | Academia | Yutai Hou, Yongkui Lai, Yushan Wu, Wanxiang Che*, Ting Liu; School of Computer Science and Technology, Harbin Institute of Technology, China; {ythou, yklai, car, tliu}@ir.hit.edu.cn, wuyushan@hit.edu.cn |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Data and code are available at https://github.com/AtmaHou/FewShotMultiLabel. |
| Open Datasets | Yes | We conduct experiments on the public dataset TourSG (Williams et al. 2012) and introduce a new multi-intent dataset, StanfordLU. These two datasets contain multiple domains and thus allow us to simulate the few-shot situation on unseen domains. TourSG (DSTC-4) contains 25,751 utterances annotated with multiple dialogue acts and 6 separate domains... StanfordLU is a re-annotated version of the Stanford dialogue dataset (Eric et al. 2017) containing 8,038 user utterances from 3 domains: Schedule (Sc), Navigate (Na), Weather (We). |
| Dataset Splits | Yes | Each time, we pick one target domain for testing, one domain for development, and use the remaining domains of the same dataset as source domains for training. For example, on the TourSG dataset, in each round the model is trained on 4 * 100 * 16 = 6400 samples, validated on 1 * 50 * 16 = 1600 samples, and tested on 1 * 50 * 16 = 800 samples. (A split-construction sketch follows the table.) |
| Hardware Specification | No | The paper mentions using Electra-small (14M params) and BERT-base (110M params) as embedders but does not provide specific details about the hardware (e.g., GPU, CPU models) used for experiments. |
| Software Dependencies | No | The paper mentions using BERT, Electra-small, and ADAM optimizer but does not specify software versions for programming languages, libraries, or frameworks (e.g., Python version, PyTorch version). |
| Experiment Setup | Yes | We use ADAM (Kingma and Ba 2015) to train the models with batch size 4. Learning rate is set as 1e-5 for both our model and baseline models. We set α (Eq. 1) as 0.3 and vary β (Eq. 2) in {0.1, 0.5, 0.9} considering the label name's anchoring power with different datasets and support-set sizes. For the MLP of kernel regression, we employ ReLU as the activation function and vary the layers in {1, 2, 3} and hidden dimension in {5, 10, 20}. The best hyperparameters are determined on the development domains. (A configuration sketch follows the table.) |
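
The leave-one-domain-out protocol quoted in the Dataset Splits row can be made concrete with a minimal sketch. The `cross_domain_split` helper, the episode dictionary, and the placeholder domain names below are illustrative assumptions, not the authors' data-loading code; the paper only specifies that one domain is held out for testing, one for development, and the rest are pooled for training.

```python
# Minimal sketch of the leave-one-domain-out split described above.
# The episode dictionary and domain names are illustrative placeholders.
from typing import Dict, List, Tuple

def cross_domain_split(
    episodes_by_domain: Dict[str, List[dict]],
    test_domain: str,
    dev_domain: str,
) -> Tuple[List[dict], List[dict], List[dict]]:
    """Hold out one domain for testing, one for development,
    and pool every remaining domain as training source data."""
    assert test_domain != dev_domain
    train, dev, test = [], [], []
    for domain, episodes in episodes_by_domain.items():
        if domain == test_domain:
            test.extend(episodes)
        elif domain == dev_domain:
            dev.extend(episodes)
        else:
            train.extend(episodes)
    return train, dev, test

# Usage with six placeholder domains standing in for TourSG's six domains:
toursg_episodes = {f"domain_{i}": [] for i in range(6)}
train_eps, dev_eps, test_eps = cross_domain_split(
    toursg_episodes, test_domain="domain_0", dev_domain="domain_1")
```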
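
The Experiment Setup row can likewise be read as a small PyTorch configuration. The sketch below is an assumption-laden illustration: `build_kernel_mlp` and the meaning of its scalar output are hypothetical, and in the paper the same Adam optimizer also updates the BERT/Electra embedder, which is omitted here. The paper itself only states that the kernel-regression MLP uses ReLU with 1-3 layers and 5/10/20 hidden units.

```python
# Minimal sketch of the quoted training configuration, assuming PyTorch.
import torch
import torch.nn as nn

def build_kernel_mlp(in_dim: int, hidden_dim: int = 10, n_layers: int = 2) -> nn.Sequential:
    """Small ReLU MLP used inside the kernel-regression component (hypothetical shape)."""
    layers, dim = [], in_dim
    for _ in range(n_layers):
        layers += [nn.Linear(dim, hidden_dim), nn.ReLU()]
        dim = hidden_dim
    layers.append(nn.Linear(dim, 1))  # scalar output (assumed, e.g. a regressed threshold)
    return nn.Sequential(*layers)

mlp = build_kernel_mlp(in_dim=768)  # 768 = BERT-base hidden size
optimizer = torch.optim.Adam(mlp.parameters(), lr=1e-5)  # paper: ADAM, learning rate 1e-5
batch_size = 4                       # paper: batch size 4
alpha = 0.3                          # Eq. 1 weight, fixed at 0.3
beta = 0.5                           # Eq. 2 weight, searched over {0.1, 0.5, 0.9}
```

The depth and width defaults above correspond to the middle of the paper's search grid; the best values are selected on the development domains.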