Discovering New Intents via Constrained Deep Adaptive Clustering with Cluster Refinement
Authors: Ting-En Lin, Hua Xu, Hanlei Zhang (pp. 8360-8367)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on the three benchmark datasets show that our method can yield significant improvements over strong baselines. |
| Researcher Affiliation | Academia | (1) State Key Laboratory of Intelligent Technology and Systems, Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; (2) Beijing National Research Center for Information Science and Technology (BNRist), Beijing 100084, China; (3) School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/thuiar/CDAC-plus |
| Open Datasets | Yes | We conduct experiments on three publicly available short text datasets. The detailed statistics are shown in Table 1. |
| Dataset Splits | Yes | Besides, we divide all datasets into training, validation, and test sets. First, we train the model by limited labeled data (containing known intents) and unlabeled data (containing all intents) in the training set. Second, we tune the model on the validation set, which only contains known intents. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'implemented in PyTorch' and 'pre-trained BERT model', but does not specify version numbers for these software components. |
| Experiment Setup | Yes | The training batch size is 256, and the learning rate is 5e-5. We use the same dynamic thresholds as DAC (Chang et al. 2017) and set u(λ) = 0.95 − λ, l(λ) = 0.455 + 0.1·λ, and η = 0.009. During the refinement stage, we perform K-means on intent representation I to obtain the initial cluster centroids U and set the stop criteria δ_label as 0.1%. |
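
The threshold settings in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not taken from the authors' repository. It assumes the upper threshold reads u(λ) = 0.95 − λ (as in DAC), that λ grows by η after each training iteration, and that training with pairwise thresholds stops once u(λ) ≤ l(λ). It also assumes the refinement stop criterion δ_label is compared against the fraction of samples whose cluster assignment changed between two consecutive iterations. The function names `threshold_schedule`, `label_change_ratio`, and `has_converged` are hypothetical.

```python
def upper_threshold(lam: float) -> float:
    """u(lambda): pairs with similarity above this are treated as similar (assumed form)."""
    return 0.95 - lam


def lower_threshold(lam: float) -> float:
    """l(lambda): pairs with similarity below this are treated as dissimilar."""
    return 0.455 + 0.1 * lam


def threshold_schedule(eta: float = 0.009):
    """List (lambda, u, l) per iteration while u(lambda) > l(lambda).

    Assumption: lambda starts at 0 and increases by eta each iteration.
    """
    steps = []
    k = 0
    while True:
        lam = k * eta
        u, l = upper_threshold(lam), lower_threshold(lam)
        if u <= l:
            break
        steps.append((lam, u, l))
        k += 1
    return steps


def label_change_ratio(previous, current) -> float:
    """Fraction of samples whose cluster assignment changed between iterations."""
    if len(previous) != len(current) or not previous:
        raise ValueError("label lists must be non-empty and of equal length")
    changed = sum(1 for p, c in zip(previous, current) if p != c)
    return changed / len(previous)


def has_converged(previous, current, delta_label: float = 0.001) -> bool:
    """Assumed stop rule: fewer than delta_label (0.1%) of assignments changed."""
    return label_change_ratio(previous, current) < delta_label


if __name__ == "__main__":
    schedule = threshold_schedule()
    lam, u, l = schedule[0]
    print(f"first step: lambda={lam:.3f}, u={u:.3f}, l={l:.3f}")
    print(f"iterations before the thresholds meet: {len(schedule)}")
```

Under these assumptions the two thresholds meet near λ = 0.45, so the gap between "similar" and "dissimilar" pairs narrows gradually and more pairs take part in training as the schedule proceeds.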