Vision-fused Attack: Advancing Aggressive and Stealthy Adversarial Text against Neural Machine Translation

Authors: Yanni Xue, Haojie Hao, Jiakai Wang, Qiang Sheng, Renshuai Tao, Yu Liang, Pu Feng, Xianglong Liu

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on various models, including large language models (LLMs) like LLaMA and GPT-3.5, strongly support that VFA outperforms the comparisons by large margins (up to 81%/14% improvements on ASR/SSIM). To demonstrate the effectiveness of the proposed method, we conduct extensive experiments under white-box and black-box settings on various representative models and widely-used datasets, including open-source and closed-source large language models like GPT-3.5 and LLaMA. The experimental results strongly support that our VFA outperforms the comparisons by large margins.
Researcher Affiliation | Collaboration | Yanni Xue (1), Haojie Hao (1), Jiakai Wang (2), Qiang Sheng (4), Renshuai Tao (5), Yu Liang (6), Pu Feng (1), and Xianglong Liu (1,2,3). Affiliations: (1) State Key Laboratory of Complex & Critical Software Environment, Beihang University, Beijing, China; (2) Zhongguancun Laboratory, Beijing, China; (3) Institute of Data Space, Hefei Comprehensive National Science Center, Anhui, China; (4) Institute of Computing Technology, Chinese Academy of Sciences; (5) Beijing Jiaotong University; (6) Beijing University of Technology.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code can be found at https://github.com/Levelower/VFA.
Open Datasets | Yes | We choose the validation set of WMT19 [Ng et al., 2019], WMT18 [Bojar et al., 2018], and TED [Cettolo et al., 2016] for the Chinese-English (Zh-En) translation task and the test set of ASPEC [Nakazawa et al., 2016] for the Japanese-English (Ja-En) translation task. These datasets and models are widely used in previous studies. (A hedged data-loading sketch appears after this table.)
Dataset Splits | No | The paper mentions using the 'validation set of WMT19, WMT18, and TED' but does not specify exact percentages or sample counts for training, validation, or test splits; it implicitly relies on the predefined splits without detailing them.
Hardware Specification | Yes | We conduct experiments in a cluster of NVIDIA GeForce RTX 3090 GPUs.
Software Dependencies | No | The paper mentions using Hugging Face's Marian model, a Hugging Face sentence-transformer model, a BERT-architecture model, and Faiss, but it does not provide version numbers for any of these components. (A dependency sketch with assumed checkpoint names follows the table.)
Experiment Setup | Yes | As for the hyperparameter settings, we set the global perception constraint to 0.95 and the replacement rate to 0.2. (A sketch of these settings as a config follows the table.)
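
The evaluation data listed in the Open Datasets row is publicly available. As a rough illustration, the sketch below loads the WMT19 Zh-En validation split through the Hugging Face `datasets` library; the dataset identifier and loading path are assumptions, since the paper does not state how the data was obtained.

from datasets import load_dataset

# WMT19 Zh-En: the paper evaluates on the validation set, not training data.
# TED (IWSLT) and ASPEC are distributed through their own channels and are
# not loaded here.
wmt19 = load_dataset("wmt19", "zh-en", split="validation")
print(wmt19[0]["translation"])  # {'zh': '...', 'en': '...'}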
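The dependencies named in the Software Dependencies row can be assembled as follows. This is a minimal sketch assuming common checkpoint names (Helsinki-NLP/opus-mt-zh-en for the Marian Zh-En model, paraphrase-multilingual-MiniLM-L12-v2 for the sentence-transformer); the paper pins neither checkpoints nor versions, so treat every name here as an assumption.

import faiss
import numpy as np
from transformers import MarianMTModel, MarianTokenizer
from sentence_transformers import SentenceTransformer

# Marian Zh-En translation model via Hugging Face (checkpoint is an assumption).
tokenizer = MarianTokenizer.from_pretrained("Helsinki-NLP/opus-mt-zh-en")
model = MarianMTModel.from_pretrained("Helsinki-NLP/opus-mt-zh-en")

batch = tokenizer(["今天天气很好。"], return_tensors="pt", padding=True)
outputs = model.generate(**batch)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))

# Sentence-transformer embeddings indexed with Faiss for fast similarity
# search over candidate substitutions (model name is an assumption).
encoder = SentenceTransformer("paraphrase-multilingual-MiniLM-L12-v2")
embeddings = encoder.encode(["候选句子一", "候选句子二"], normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product on unit vectors = cosine
index.add(np.asarray(embeddings, dtype="float32"))
scores, ids = index.search(np.asarray(embeddings[:1], dtype="float32"), k=2)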
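Only two hyperparameter values are reported in the Experiment Setup row: a global perception constraint of 0.95 and a replacement rate of 0.2. The sketch below records them as a config and shows one plausible use, assuming the replacement rate caps the fraction of characters substituted per sentence and the perception constraint acts as a similarity threshold; the paper does not detail the exact usage, so both interpretations are assumptions.

from dataclasses import dataclass
import math

@dataclass
class VFAConfig:
    # The two values below are the only hyperparameters the paper reports.
    perception_constraint: float = 0.95  # global similarity threshold (assumed usage)
    replacement_rate: float = 0.2        # fraction of characters eligible for replacement

def replacement_budget(sentence: str, cfg: VFAConfig) -> int:
    # Assumed interpretation: the rate caps how many characters may be swapped.
    return math.floor(len(sentence) * cfg.replacement_rate)

def passes_perception_constraint(similarity: float, cfg: VFAConfig) -> bool:
    # Assumed interpretation: a candidate perturbation must keep global
    # similarity at or above the constraint.
    return similarity >= cfg.perception_constraint

cfg = VFAConfig()
print(replacement_budget("这是一个测试句子。", cfg))  # 1 (9 characters * 0.2, floored)
print(passes_perception_constraint(0.97, cfg))       # True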