ContrastNet: A Contrastive Learning Framework for Few-Shot Text Classification

Authors: Junfan Chen, Richong Zhang, Yongyi Mao, Jie Xu

AAAI 2022, pp. 10492-10500

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on 8 few-shot text classification datasets show that ContrastNet outperforms the current state-of-the-art models.
Researcher Affiliation | Academia | (1) SKLSDE, School of Computer Science and Engineering, Beihang University, Beijing, China; (2) School of Electrical Engineering and Computer Science, University of Ottawa, Ottawa, Canada; (3) Department of Computer Science, University of Leeds, UK
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code via a specific repository link, explicit code release statement, or code in supplementary materials.
Open Datasets | Yes | We evaluate our few-shot text classification models on 8 text classification datasets, including 4 intent classification datasets: Banking77 (Casanueva et al. 2020), HWU64 (Liu et al. 2019a), Clinic150 (Larson et al. 2019), Liu57 (Liu et al. 2019b), and 4 news or review classification datasets: HuffPost (Bao et al. 2020), Amazon (He and McAuley 2016), Reuters (Bao et al. 2020), 20News (Lang 1995).
Dataset Splits | Yes | For each run, the training, validation, and testing classes are randomly re-split from the total class set. (A small class-splitting sketch follows the table.)
Hardware Specification | Yes | All experiments are run on a single NVIDIA Tesla V100 PCIe 32GB GPU.
Software Dependencies | No | We implement the proposed models using the PyTorch deep learning framework. The paper mentions a software framework but does not provide a specific version number or other detailed software dependencies.
Experiment Setup | Yes | The temperature factors of the losses Lcon, Ltask, and Linst are set to 5.0, 7.0, and 7.0, respectively. The loss weight α is initialized to 0.95 and decreases during training using the loss annealing strategy (Dopierre, Gravier, and Logerais 2021), and the loss weight β is set to 0.1. We optimize the models using Adam (Kingma and Ba 2015) with an initial learning rate of 1e-6. All the hyper-parameters are selected by greedy search on the validation set. (A training-loop sketch using these values follows the table.)
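
To make the class-level re-splitting described in the Dataset Splits row concrete, here is a minimal sketch; the class counts, seed handling, and function name are hypothetical placeholders rather than values or code from the paper.

```python
import random

def resplit_classes(all_classes, n_train, n_val, seed):
    """Randomly re-split the total class set into disjoint train/validation/test
    class subsets for one run (all counts here are placeholders)."""
    rng = random.Random(seed)
    classes = list(all_classes)
    rng.shuffle(classes)
    return (classes[:n_train],
            classes[n_train:n_train + n_val],
            classes[n_train + n_val:])

# Hypothetical example: 77 intent classes re-split for run 0.
train_cls, val_cls, test_cls = resplit_classes(range(77), n_train=25, n_val=25, seed=0)
```

Each run would then build its training, validation, and test episodes only from the classes in its own subset.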
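
A minimal PyTorch sketch of how the quoted hyper-parameters could fit together is given below. The temperatures, the α and β values, the downward annealing of α, and the Adam learning rate follow the Experiment Setup row; the stand-in encoder, the generic InfoNCE-style loss, and in particular the exact way the three losses are combined are illustrative assumptions, not the paper's stated formulation.

```python
import torch
import torch.nn.functional as F

# Values quoted in the Experiment Setup row.
TAU_CON, TAU_TASK, TAU_INST = 5.0, 7.0, 7.0   # temperatures of Lcon, Ltask, Linst
ALPHA_INIT = 0.95                              # annealed downward during training
BETA = 0.1                                     # weight of the instance-level loss
LR = 1e-6
TOTAL_STEPS = 100                              # placeholder, not from the paper

def anneal_alpha(step, total_steps, alpha_init=ALPHA_INIT):
    # Simple linear decay; the paper instead follows the loss-annealing
    # strategy of Dopierre, Gravier, and Logerais (2021).
    return alpha_init * (1.0 - step / total_steps)

def info_nce(anchor, positive, tau):
    # Generic temperature-scaled contrastive loss (assumed form).
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / tau
    targets = torch.arange(len(a))
    return F.cross_entropy(logits, targets)

# Stand-in encoder so the sketch runs end to end; the paper uses a BERT-based encoder.
encoder = torch.nn.Linear(768, 128)
optimizer = torch.optim.Adam(encoder.parameters(), lr=LR)

for step in range(TOTAL_STEPS):
    x, x_aug = torch.randn(16, 768), torch.randn(16, 768)   # dummy feature pairs
    z, z_aug = encoder(x), encoder(x_aug)

    l_con = info_nce(z, z_aug, TAU_CON)     # placeholder for the supervised contrastive loss
    l_task = info_nce(z, z_aug, TAU_TASK)   # placeholder for the task-level regularizer
    l_inst = info_nce(z, z_aug, TAU_INST)   # placeholder for the instance-level regularizer

    alpha = anneal_alpha(step, TOTAL_STEPS)
    # How the three losses are weighted together here is an assumption for illustration.
    loss = alpha * l_con + (1.0 - alpha) * l_task + BETA * l_inst

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Greedy hyper-parameter search on the validation set, as quoted above, would then sweep each of these values in turn while keeping the others fixed.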