CINS: Comprehensive Instruction for Few-Shot Learning in Task-Oriented Dialog Systems

Authors: Fei Mi, Yasheng Wang, Yitong Li
Pages: 11076-11084

AAAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments are conducted on these ToD tasks in realistic few-shot learning scenarios with small validation data. Empirical results demonstrate that the proposed CINS approach consistently improves techniques that fine-tune PLMs with raw input or short prompts.
Researcher Affiliation | Industry | Huawei Noah's Ark Lab; Huawei Technologies Co., Ltd. {mifei2,wangyasheng,liyitong3}@huawei.com
Pseudocode | No | The paper describes methods but does not include any formal pseudocode blocks or algorithms.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the proposed method is publicly available.
Open Datasets | Yes | OOS: For intent classification, we use a benchmark dataset from Larson et al. (2019). ... We evaluate the dialog state tracking task using MultiWOZ 2.0 (Budzianowski et al. 2018). ... FewShotSGD (Kale and Rastogi 2020a) is the version of the schema-guided dataset (Rastogi et al. 2019) for natural language generation.
Dataset Splits | Yes | It contains 8,420/1,000/1,000 dialogues for train/validation/test spanning over 7 domains. ... The full train/validation/test sets contain 160k/24k/42k utterances.
Hardware Specification | Yes | We use 4 NVIDIA V100 GPUs for all of our experiments.
Software Dependencies | No | The paper mentions using 'T5-small' and 'T5-base' models from the HuggingFace repository and the AdamW optimizer, but it does not specify exact version numbers for the HuggingFace library, T5, or AdamW.
Experiment Setup | Yes | All models are trained using the AdamW (Loshchilov and Hutter 2018) optimizer with an initial learning rate of 1e-4 for DST and NLG, and 3e-4 for IC. In all experiments, we train the models with batch size 8 for 30 epochs for IC, 20 epochs for DST, and 50 epochs for NLG. Early stopping is applied according to the loss on the validation set.
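
Since the paper does not release code, the following is a minimal sketch of the reported fine-tuning setup, assuming PyTorch and HuggingFace Transformers with a T5-small checkpoint. The learning rates, batch size, epoch counts, and early stopping on validation loss come from the Experiment Setup row above; the `finetune` helper, the data-loader format (dicts of input/target strings), and the patience value are hypothetical.

```python
# Illustrative fine-tuning sketch reflecting the hyperparameters quoted above.
# This is not the authors' code (none is available); model and optimizer choices
# follow the paper's description, everything else is an assumption.
import torch
from torch.optim import AdamW
from transformers import T5ForConditionalGeneration, T5Tokenizer

TASK_CONFIG = {
    # task: (learning rate, epochs) as reported in the paper
    "IC":  (3e-4, 30),
    "DST": (1e-4, 20),
    "NLG": (1e-4, 50),
}
BATCH_SIZE = 8  # used for all tasks in the paper


def finetune(task, train_loader, val_loader, model_name="t5-small", patience=3):
    """Fine-tune T5 on one ToD task; `patience` for early stopping is an assumption."""
    lr, epochs = TASK_CONFIG[task]
    device = "cuda" if torch.cuda.is_available() else "cpu"
    tokenizer = T5Tokenizer.from_pretrained(model_name)
    model = T5ForConditionalGeneration.from_pretrained(model_name).to(device)
    optimizer = AdamW(model.parameters(), lr=lr)

    def batch_loss(batch):
        # Batches are assumed to be dicts with lists of input/target strings.
        enc = tokenizer(batch["input"], return_tensors="pt",
                        padding=True, truncation=True).to(device)
        labels = tokenizer(batch["target"], return_tensors="pt",
                           padding=True, truncation=True).input_ids.to(device)
        return model(**enc, labels=labels).loss

    best_val_loss, bad_epochs = float("inf"), 0
    for epoch in range(epochs):
        model.train()
        for batch in train_loader:
            loss = batch_loss(batch)
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

        # Early stopping on validation loss, as described in the paper.
        model.eval()
        with torch.no_grad():
            val_loss = sum(batch_loss(b).item() for b in val_loader)
        if val_loss < best_val_loss:
            best_val_loss, bad_epochs = val_loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return model
```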