Effective Open Intent Classification with K-center Contrastive Learning and Adjustable Decision Boundary

Authors: Xiaokang Liu, Jianquan Li, Jingjing Mu, Min Yang, Ruifeng Xu, Benyou Wang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three benchmark datasets clearly demonstrate the effectiveness of our method for open intent classification.
Researcher Affiliation | Collaboration | Xiaokang Liu (1)*, Jianquan Li (2)*, Jingjing Mu (2), Min Yang (3), Ruifeng Xu (4), Benyou Wang (5,6); (1) China Automotive Technology and Research Center Co., Ltd.; (2) Beijing Ultrapower Software Co., Ltd.; (3) Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; (4) Harbin Institute of Technology, Shenzhen; (5) The Chinese University of Hong Kong, Shenzhen; (6) Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, China
Pseudocode | No | The paper describes methods textually and with mathematical equations but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | For reproducibility, we submit the code at: https://github.com/lxk00/CLAP
Open Datasets | Yes | We conduct extensive experiments on three publicly available benchmark datasets. BANKING (Casanueva et al. 2020) is a fine-grained dataset in the banking domain, which contains 77 intents and 13,083 customer service queries. OOS (Larson et al. 2019) is a dataset for intent classification and out-of-scope prediction. It consists of 150 intents, 22,500 in-domain queries and 1,200 out-of-domain queries. Stack Overflow (Xu et al. 2015) is a dataset originally released on Kaggle.com.
Dataset Splits | Yes | Per-dataset statistics reported in the paper:
    Dataset        | Classes | Train / Valid / Test | Length (max / mean)
    BANKING        | 77      | 9003 / 1000 / 3080   | 79 / 11.91
    OOS            | 150     | 15000 / 3000 / 5700  | 28 / 8.31
    Stack Overflow | 20      | 12000 / 2000 / 6000  | 41 / 9.18
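These reported split sizes can double as a quick sanity check when reproducing the data preparation. The sketch below is illustrative only and is not taken from the authors' repository; the dictionary layout and the check_split_sizes helper are hypothetical names introduced here.

```python
# Reported split statistics from the paper, encoded as a plain dict so a
# local copy of the data can be checked against them. Illustrative only;
# the authors' code may organize the data differently.
DATASET_SPLITS = {
    "BANKING":       {"classes": 77,  "train": 9003,  "valid": 1000, "test": 3080},
    "OOS":           {"classes": 150, "train": 15000, "valid": 3000, "test": 5700},
    "StackOverflow": {"classes": 20,  "train": 12000, "valid": 2000, "test": 6000},
}

def check_split_sizes(name: str, train, valid, test) -> None:
    """Raise if the loaded splits do not match the sizes reported in the paper."""
    expected = DATASET_SPLITS[name]
    actual = {"train": len(train), "valid": len(valid), "test": len(test)}
    for split, count in actual.items():
        if count != expected[split]:
            raise ValueError(
                f"{name} {split}: expected {expected[split]} examples, got {count}"
            )
```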
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory specifications, or cloud instances).
Software Dependencies | No | The paper mentions using a 'pre-trained BERT model (base-uncased)' but does not provide specific version numbers for other key software dependencies or libraries (e.g., Python, PyTorch/TensorFlow versions).
Experiment Setup | Yes | We freeze all the parameters of BERT except the last transformer layer to speed up training and avoid over-fitting. The number of positive samples in KCCL ranges from 1 to 10, and the number of negative samples M is set to 1. λ is set to 0.25. In the second training stage, we freeze the BERT model and train only the decision boundary. The batch size is set to 32, e ranges from 0.5 to 1.2, s from 0 to 0.5, and η from 0 to 1. We utilize Adam to optimize the model with a learning rate of 2e-5.
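For readers reconstructing this first-stage setup, the following is a minimal sketch assuming PyTorch and the HuggingFace transformers implementation of bert-base-uncased. Variable names and structure are our own and may differ from the authors' code at https://github.com/lxk00/CLAP; the KCCL loss and decision-boundary module are not shown.

```python
import torch
from transformers import BertModel

# Load the pre-trained encoder named in the paper (BERT base-uncased).
bert = BertModel.from_pretrained("bert-base-uncased")

# Freeze every BERT parameter, then unfreeze only the last transformer layer,
# as described in the experiment setup.
for param in bert.parameters():
    param.requires_grad = False
for param in bert.encoder.layer[-1].parameters():
    param.requires_grad = True

# Hyperparameters quoted above: K positives per anchor swept from 1 to 10,
# M = 1 negative, loss weight lambda = 0.25, batch size 32.
NUM_POSITIVES = 10
NUM_NEGATIVES = 1   # M
LAMBDA = 0.25
BATCH_SIZE = 32

# Adam with learning rate 2e-5 over the trainable parameters only. In the full
# pipeline the task-specific heads (and, in the second stage, the decision
# boundary with BERT frozen) would also be added to the optimizer.
optimizer = torch.optim.Adam(
    (p for p in bert.parameters() if p.requires_grad),
    lr=2e-5,
)
```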