Explainable Text Classification via Attentive and Targeted Mixing Data Augmentation
Authors: Songhao Jiang, Yan Chu, Zhengkui Wang, Tianxing Ma, Hanlin Wang, Wenxuan Lu, Tianning Zang, Bo Wang
IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The results indicate that ATMIX is more effective with higher explainability than the typical classification models, hidden-level, and input-level mixup models. We extensively evaluate our proposed model over multiple text datasets. |
| Researcher Affiliation | Collaboration | Songhao Jiang (1,4,5), Yan Chu (2), Zhengkui Wang (3), Tianxing Ma (1,4), Hanlin Wang (2), Wenxuan Lu (1,4), Tianning Zang (1,4), and Bo Wang (5). 1: Institute of Information Engineering, Chinese Academy of Sciences; 2: Harbin Engineering University; 3: InfoComm Technology Cluster, Singapore Institute of Technology; 4: School of Cyber Security, University of Chinese Academy of Sciences; 5: CNCERT/CC |
| Pseudocode | Yes | Algorithm 1 Framework of DSA |
| Open Source Code | No | The paper does not provide any explicit statement about releasing its own source code or a link to a repository for the described methodology. Mentions of the Hugging Face Hub, NLTK, and WordNet refer to external resources used, not the authors' own implementation code (a generic WordNet lookup sketch follows the table). |
| Open Datasets | Yes | To verify the effectiveness of ATMIX, we used five typical sentiment classification datasets, namely three two-category datasets SST-2 [Socher et al., 2013], YELP-2, and IMDB [Maas et al., 2011], and two five-category datasets SST-1 [Socher et al., 2013] and YELP-5. Table 1: The detailed statistics of the experimental datasets. |
| Dataset Splits | No | The paper provides train/test sizes in Table 1 and mentions specific percentages for training-data selection (e.g., "randomly select 1% of YELP-2, YELP-5, and 20% of IMDB for the training"). However, it does not explicitly describe a validation split or its size (a subsampling sketch follows the table). |
| Hardware Specification | Yes | All experiments run on NVIDIA Tesla A100 GPUs and the epoch values are 10. |
| Software Dependencies | No | We use TensorFlow to reproduce TextCNN and use PyTorch to reproduce ALBERT and BERT. For ALBERT, we use the ALBERT-base-v2 pre-trained model from the Hugging Face Hub... For BERT, we use the BERT-base-uncased pre-trained model from the Hugging Face Hub... we use the NLTK tool and the WordNet corpus. No specific version numbers are provided for TensorFlow, PyTorch, or NLTK (a model-loading sketch follows the table). |
| Experiment Setup | Yes | This section provides all the detailed parameter settings for different models. For TextCNN, ... the dropout value is 0.5 without L2 regularization. The maximum input length is 128, and the batch size is 16. For ALBERT, ... the maximum length and the batch size of the input sequence are 256 and 20 on IMDB, while 128 and 30 on the other datasets. For BERT, ... the maximum length of the input sequence is 256 on IMDB, while 128 on the other datasets. ... α = 0.2 for Mixup and TMix, and the window size is 10% for SSMix (a mixup sketch follows the table). All experiments run on NVIDIA Tesla A100 GPUs and the epoch values are 10. |
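The open-source-code row notes that the paper uses NLTK and the WordNet corpus as external resources. Below is a minimal sketch of the generic WordNet synonym lookup such a pipeline could build on, assuming the standard NLTK API; the `synonyms` helper is our own illustration, as the paper does not show its lookup code.

```python
import nltk
from nltk.corpus import wordnet

nltk.download("wordnet", quiet=True)  # one-time corpus download

def synonyms(word):
    """Collect distinct WordNet lemmas for `word`, excluding the word itself."""
    lemmas = {
        lemma.name().replace("_", " ")
        for synset in wordnet.synsets(word)
        for lemma in synset.lemmas()
    }
    lemmas.discard(word)
    return sorted(lemmas)

print(synonyms("movie"))  # e.g. ['film', 'flick', 'motion picture', ...]
```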
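The dataset-splits row quotes a low-resource setting: 1% of YELP-2 and YELP-5, and 20% of IMDB, used for training. A minimal subsampling sketch under the assumption of uniform random selection; the seed is an assumption, since the paper reports neither a seed nor a sampling scheme.

```python
import random

def subsample(train_examples, fraction, seed=42):
    """Randomly keep a fraction of the training set, e.g. fraction=0.01
    for YELP-2/YELP-5 or fraction=0.2 for IMDB, mirroring the paper's
    low-resource setting. The seed is an assumption."""
    rng = random.Random(seed)
    k = max(1, int(len(train_examples) * fraction))
    return rng.sample(train_examples, k)
```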
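The dependencies row names the ALBERT-base-v2 and BERT-base-uncased checkpoints from the Hugging Face Hub. A sketch of loading them with the `transformers` library; choosing `AutoModelForSequenceClassification` and the `num_labels` values is our assumption, inferred from the paper's two- and five-category datasets.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Checkpoints named in the paper; num_labels=2 fits SST-2/YELP-2/IMDB,
# num_labels=5 would fit SST-1/YELP-5.
albert_tok = AutoTokenizer.from_pretrained("albert-base-v2")
albert = AutoModelForSequenceClassification.from_pretrained(
    "albert-base-v2", num_labels=2
)

bert_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
```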
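The setup row fixes α = 0.2 for the Mixup and TMix baselines. For context, a minimal sketch of hidden-level mixup in the TMix style, interpolating hidden states and soft labels with λ ~ Beta(α, α); this illustrates the baselines' mixing rule, not ATMIX's own attentive, targeted mixing.

```python
import torch

def mixup_hidden(h_a, h_b, y_a, y_b, alpha=0.2):
    """TMix-style hidden-level mixup: blend two examples' hidden states and
    one-hot labels with a single lambda drawn from Beta(alpha, alpha).
    alpha=0.2 matches the baseline setting reported in the paper."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    h_mix = lam * h_a + (1.0 - lam) * h_b
    y_mix = lam * y_a + (1.0 - lam) * y_b
    return h_mix, y_mix
```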