SALNet: Semi-supervised Few-Shot Text Classification with Attention-based Lexicon Construction
Authors: Ju-Hyoung Lee, Sang-Ki Ko, Yo-Sub Han (pp. 13189-13197)
AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach using four publicly available benchmark datasets and compare the performance with the previous state-of-the-art methods including the other semi-supervised learning algorithms (Yarowsky 1995; Jo and Cinarel 2019) and pretraining algorithms (Gururangan et al. 2019; Devlin et al. 2019). The experimental results demonstrate that our approach, Semi-supervised with Attention-based Lexicon construction Network (SALNet), outperforms the previous state-of-the-art methods on four benchmark datasets. |
| Researcher Affiliation | Academia | Ju-Hyoung Lee¹, Sang-Ki Ko², Yo-Sub Han¹; ¹Yonsei University, Seoul, Republic of Korea; ²Kangwon National University, Kangwon, Republic of Korea |
| Pseudocode | No | The paper describes the proposed method in numbered steps within the “Methods” section (e.g., “1. Create a base classifier...”, “2. Re-run the base classifier...”), but these are presented in paragraph form and not as formal pseudocode or an algorithm block. |
| Open Source Code | No | The paper mentions other baselines’ code availability, for example, “Self-training (Yarowsky 1995): ... Since the source code is not available in public, we implement this method using their pseudo algorithm.” and “Delta-training (Jo and Cinarel 2019): Since the code was not published, we implement it ourselves...”, but there is no statement or link indicating that the authors’ own code for SALNet is publicly available. |
| Open Datasets | Yes | We use four benchmark datasets to evaluate the performance of our proposed method across different domains: IMDB review (Maas et al. 2011), AG News (Zhang, Zhao, and LeCun 2015), Yahoo! Answers (Chang et al. 2008), DBpedia (Mendes, Jakob, and Bizer 2012). |
| Dataset Splits | Yes | In the new labeled dataset, we use 85% of its data as a training set, and 15% of its data as a development set. We remove the labels of the remaining 99% data. All data have a balanced class distribution. We use the development set to determine early-stopping at each epoch. Table 2 presents the data distribution. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using “pretrained GloVe”, “attention-based LSTM”, “Text CNN”, and the “Adam optimizer”, along with “BERT”, but does not provide specific version numbers for any of these software dependencies. |
| Experiment Setup | Yes | Hyperparameters: We use pretrained GloVe (Pennington, Socher, and Manning 2014) as word embedding for all experiments. GloVe is trained on a dataset of 42 billion tokens with a vocabulary of 1.9 million words and has 300-dimensional embedding vectors. Since attention-based LSTM (Wang et al. 2016) and Text CNN (Kim 2014) are simple and have high performance, we select the two basic models as classifiers (Jo and Cinarel 2019). The Text CNN consists of filter windows of size 3, 4, 5 with 100 feature maps, each followed by ReLU activation and max-pooling. The attention-based LSTM has a hidden size of 300. We train all classifiers with a batch size of 128, and optimize them using the Adam optimizer (Kingma and Ba 2015) with learning rates of 0.001 and 0.005. Our proposed method, SALNet, uses lexicons of size 50 and three (= t1) and four (= t2) matching words for predicting the class of unlabeled data. |
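
For concreteness, the sketch below reconstructs the split described in the Dataset Splits row: 1% of each training set is kept as labeled data, that labeled portion is split 85/15 into train/dev sets, and the remaining 99% is treated as unlabeled. The function name and the use of scikit-learn's `train_test_split` are assumptions made for illustration; the authors did not release preprocessing code.

```python
# Assumed reconstruction of the semi-supervised split described in the paper:
# 1% labeled (then 85% train / 15% dev), 99% unlabeled, balanced classes.
from sklearn.model_selection import train_test_split

def make_semi_supervised_split(texts, labels, labeled_fraction=0.01, seed=42):
    # Keep 1% as labeled data; discard the labels of the remaining 99%.
    labeled_x, unlabeled_x, labeled_y, _ = train_test_split(
        texts, labels, train_size=labeled_fraction, stratify=labels, random_state=seed
    )
    # Split the labeled portion 85/15; the dev set drives early stopping per epoch.
    train_x, dev_x, train_y, dev_y = train_test_split(
        labeled_x, labeled_y, train_size=0.85, stratify=labeled_y, random_state=seed
    )
    return (train_x, train_y), (dev_x, dev_y), unlabeled_x
```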
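
The hyperparameters in the Experiment Setup row translate into a model and optimizer configuration along the following lines. This is a minimal PyTorch sketch of the Text CNN branch only (filter windows 3/4/5 with 100 feature maps each, ReLU and max-pooling, 300-dimensional GloVe embeddings, Adam with a 0.001 learning rate, batch size 128); class and variable names are illustrative and not taken from the paper, whose code is not publicly available.

```python
# Minimal sketch of the Text CNN classifier from the reported hyperparameters.
# SALNetTextCNN, vocab_size, and num_classes are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SALNetTextCNN(nn.Module):
    def __init__(self, vocab_size, num_classes, embed_dim=300,
                 filter_sizes=(3, 4, 5), num_filters=100):
        super().__init__()
        # The paper initializes embeddings from pretrained 300-d GloVe (42B tokens).
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(embed_dim, num_filters, kernel_size=k) for k in filter_sizes
        )
        self.fc = nn.Linear(num_filters * len(filter_sizes), num_classes)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len)
        x = self.embedding(token_ids).transpose(1, 2)      # (batch, embed_dim, seq_len)
        # One ReLU + max-over-time pooling per filter window size.
        pooled = [F.relu(conv(x)).max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))           # (batch, num_classes)


model = SALNetTextCNN(vocab_size=20000, num_classes=4)     # e.g., AG News has 4 classes
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # paper reports lr 0.001 / 0.005
criterion = nn.CrossEntropyLoss()                          # trained with batch size 128
```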