Robust Fine-tuning via Perturbation and Interpolation from In-batch Instances
Authors: Shoujie Tong, Qingxiu Dong, Damai Dai, Yifan Song, Tianyu Liu, Baobao Chang, Zhifang Sui
IJCAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on various tasks in the GLUE benchmark show that MATCH-TUNING consistently outperforms vanilla fine-tuning by 1.64 points. Moreover, MATCH-TUNING exhibits remarkable robustness to adversarial attacks and data imbalance. We conduct a comprehensive evaluation of MATCH-TUNING on the GLUE benchmark. |
| Researcher Affiliation | Collaboration | Key Laboratory of Computational Linguistics, Peking University; Tencent Cloud Xiaowei |
| Pseudocode | No | The paper includes mathematical equations for the proposed method but does not provide a structured pseudocode block or algorithm. |
| Open Source Code | Yes | Our code is available at https://github.com/tongshoujie/MATCH-TUNING |
| Open Datasets | Yes | We conduct experiments on four main datasets in GLUE [Wang et al., 2019] to evaluate the general performance. |
| Dataset Splits | No | The paper mentions evaluating on the 'GLUE development set' and the 'AdvGLUE validation set' but does not specify exact split percentages or sample counts for these splits. It also mentions reporting results over '10 random seeds' but gives no details on whether these seeds affect data partitioning. (The standard GLUE splits are illustrated in the first sketch after this table.) |
| Hardware Specification | Yes | All the methods are based on BERT-LARGE and tested on a single NVIDIA A40 GPU. |
| Software Dependencies | No | The paper states, 'We conduct our experiments based on the Hugging Face transformers library'. However, it does not provide a specific version number for the library or any other software dependencies. |
| Experiment Setup | No | The paper states, 'We conduct our experiments based on the Hugging Face transformers library and follow the default hyper-parameters and settings unless noted otherwise.' It refers to the batch size (n) in a formula but does not provide specific numerical values for hyperparameters such as learning rate, batch size, or number of epochs used in the experiments. It only mentions averaging over '10 random seeds'. (A hedged reproduction sketch based on these library defaults follows the table.) |
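
The section above notes that the evaluation uses "four main datasets in GLUE" and their development sets, without enumerating the tasks or sample counts. The following is a minimal sketch, assuming the standard GLUE releases distributed through the Hugging Face `datasets` library; the task names listed are placeholders, not the paper's confirmed selection.

```python
# Minimal sketch (assumption: standard GLUE releases via the `datasets` library).
# The paper evaluates on "four main datasets in GLUE"; the task names below are
# placeholders, since the review above does not enumerate them.
from datasets import load_dataset

for task in ["sst2", "qnli", "qqp", "mnli"]:
    ds = load_dataset("glue", task)
    # Each GLUE task ships with fixed train/validation/test splits, so no
    # custom split percentages are needed to reconstruct the evaluation data.
    print(task, {split: len(ds[split]) for split in ds})
```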
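Similarly, since neither the `transformers` version nor the training hyper-parameters are reported, a reproduction can only follow the quoted statement about library defaults. The sketch below is an assumption-laden baseline, not the authors' MATCH-TUNING implementation (that code is in their repository): it fine-tunes `bert-large-uncased` on one GLUE task with `Trainer` left at its defaults, and the seed value and task choice are placeholders.

```python
# Hedged baseline sketch: BERT-LARGE fine-tuned with Hugging Face `transformers`
# using library-default hyper-parameters, mirroring the paper's statement that it
# follows "the default hyper-parameters and settings unless noted otherwise".
# Model name, task, and seed are assumptions, not values reported by the paper.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments, set_seed)

set_seed(42)  # the paper averages over 10 random seeds; 42 is a placeholder

tokenizer = AutoTokenizer.from_pretrained("bert-large-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-large-uncased", num_labels=2)

ds = load_dataset("glue", "sst2")  # placeholder task

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True)

ds = ds.map(tokenize, batched=True)

# TrainingArguments left at library defaults; only the output directory and
# seed are set explicitly, since the paper reports no hyper-parameter values.
args = TrainingArguments(output_dir="out", seed=42)
trainer = Trainer(model=model, args=args,
                  train_dataset=ds["train"],
                  eval_dataset=ds["validation"],
                  tokenizer=tokenizer)
trainer.train()
```

Pinning the exact `transformers` and `datasets` versions used for such a run would address the missing dependency information flagged in the table.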