ContrastNet: A Contrastive Learning Framework for Few-Shot Text Classification
Authors: Junfan Chen, Richong Zhang, Yongyi Mao, Jie Xu
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on 8 few-shot text classification datasets show that ContrastNet outperforms the current state-of-the-art models. |
| Researcher Affiliation | Academia | (1) SKLSDE, School of Computer Science and Engineering, Beihang University, Beijing, China; (2) School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada; (3) Department of Computer Science, University of Leeds, UK |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code via a specific repository link, explicit code release statement, or code in supplementary materials. |
| Open Datasets | Yes | We evaluate our few-shot text classification models on 8 text classification datasets, including 4 intent classification datasets: Banking77 (Casanueva et al. 2020), HWU64 (Liu et al. 2019a), Clinic150 (Larson et al. 2019), Liu57 (Liu et al. 2019b) and 4 news or review classification datasets: HuffPost (Bao et al. 2020), Amazon (He and McAuley 2016), Reuters (Bao et al. 2020), 20News (Lang 1995). |
| Dataset Splits | Yes | For each run, the training, validation, and testing classes are randomly re-split from the total class set. |
| Hardware Specification | Yes | All experiments are run on a single NVIDIA Tesla V100 PCIe 32GB GPU. |
| Software Dependencies | No | We implement the proposed models using the PyTorch deep learning framework. The paper names the framework but does not give a specific version number or other detailed software dependencies. |
| Experiment Setup | Yes | The temperature factors of the losses L_con, L_task, and L_inst are set to 5.0, 7.0, and 7.0, respectively. The loss weight α is initialized to 0.95 and decreases during training using the loss annealing strategy (Dopierre, Gravier, and Logerais 2021), and the loss weight β is set to 0.1. We optimize the models using Adam (Kingma and Ba 2015) with an initial learning rate of 1e-6. All hyper-parameters are selected by greedy search on the validation set. A hedged sketch of this configuration follows the table. |
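
For readers attempting to reproduce the setup quoted in the Experiment Setup row, the following is a minimal PyTorch sketch of how the reported hyper-parameters might be wired together. The temperature values, the loss weights α and β, and the learning rate come directly from the quote; the weighted loss combination, the linear annealing schedule, and the placeholder encoder head are assumptions made only for illustration (the paper cites Dopierre, Gravier, and Logerais 2021 for the actual annealing strategy).

```python
# Minimal sketch of the reported optimizer / loss-weight configuration.
# The exact loss-combination formula and annealing schedule are NOT specified
# in the quoted text; the weighted sum and the linear decay below are
# illustrative assumptions, not the authors' implementation.
import torch

tau_con, tau_task, tau_inst = 5.0, 7.0, 7.0   # temperatures reported in the paper
beta = 0.1                                     # fixed weight for L_inst (reported)
alpha_init = 0.95                              # initial weight for L_con (reported)

model = torch.nn.Linear(768, 128)              # placeholder encoder head (assumption)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-6)  # reported learning rate

def annealed_alpha(step: int, total_steps: int, alpha0: float = alpha_init) -> float:
    """Assumed linear decay of the loss weight alpha over training."""
    return alpha0 * (1.0 - step / total_steps)

def total_loss(l_con: torch.Tensor, l_task: torch.Tensor,
               l_inst: torch.Tensor, alpha: float) -> torch.Tensor:
    # Assumed combination: supervised contrastive term traded off against the
    # task-level regularizer, plus a fixed instance-level term weighted by beta.
    return alpha * l_con + (1.0 - alpha) * l_task + beta * l_inst

# Example: the assumed weight schedule at the midpoint of a 10,000-step run.
alpha_mid = annealed_alpha(step=5000, total_steps=10000)
```

The sketch only illustrates how the quoted numbers could be plugged in; the individual contrastive losses themselves (computed with the temperatures above) are defined in the paper and are not reproduced here.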