Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

Authors: Minhao Cheng, Jinfeng Yi, Pin-Yu Chen, Huan Zhang, Cho-Jui Hsieh

AAAI 2020, pp. 3601–3608

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our algorithm to machine translation and text summarization tasks, and verify the effectiveness of the proposed algorithm: by changing fewer than 3 words, we can make a seq2seq model produce desired outputs with high success rates. ... We conduct experiments on two widely-used applications of seq2seq models: text summarization and machine translation.
Researcher Affiliation | Collaboration | Minhao Cheng (1), Jinfeng Yi (2), Pin-Yu Chen (3), Huan Zhang (1), Cho-Jui Hsieh (1); (1) Department of Computer Science, UCLA; (2) JD AI Research; (3) IBM Research
Pseudocode | Yes | Algorithm 1 (Seq2Sick):

    Input: input sequence x = {x_1, ..., x_N}, seq2seq model, target keywords {k_1, ..., k_T}
    Output: adversarial sequence x* = x + δ*
    Let s = {s_1, ..., s_M} denote the original output of x.
    Set the loss L(·) in (9) to be (3)
    if Targeted Keyword Attack then
        set the loss L(·) in (9) to be (7)
    end if
    for r = 1, 2, ..., T do
        back-propagate L to obtain the gradient ∇L(x + δ_r)
        y_{r+1} = δ_r − η ∇L(x + δ_r)                        (gradient step)
        for i = 1, 2, ..., N do                              (group-lasso shrinkage per position)
            if ||y_{r+1,i}|| > ηλ_1 then
                δ_{r+1,i} = y_{r+1,i} − ηλ_1 y_{r+1,i} / ||y_{r+1,i}||
            else
                δ_{r+1,i} = 0
            end if
        end for
        project δ_{r+1} so that x + δ_{r+1} ∈ W             (nearest embeddings in W)
    end for
    δ* = δ_T
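
The loop above is a projected proximal-gradient iteration: a gradient step on the attack loss, a group-lasso shrinkage that zeroes out positions with small perturbations (so only a few words change), and a projection back onto the word-embedding space W. Below is a minimal PyTorch sketch of one such iteration, assuming the attack operates on input word embeddings; the function name seq2sick_step, the step size eta, and the hard nearest-neighbor projection are illustrative, not the authors' exact implementation.

    import torch

    def seq2sick_step(delta, x_emb, loss_fn, vocab_emb, eta=0.1, lam1=1.0):
        # delta:     (N, d) current perturbation of the input embeddings
        # x_emb:     (N, d) embeddings of the original input tokens
        # loss_fn:   maps perturbed embeddings to the attack loss
        #            (eq. (3) for non-overlapping, eq. (7) for keyword attacks)
        # vocab_emb: (V, d) embedding matrix, i.e. the space W
        delta = delta.detach().requires_grad_(True)
        loss_fn(x_emb + delta).backward()

        # gradient step: y = delta - eta * grad L(x + delta)
        y = delta.detach() - eta * delta.grad

        # group-lasso proximal shrinkage: rows with norm <= eta*lam1 become 0,
        # larger rows shrink by eta*lam1 along their own direction
        norms = y.norm(dim=1, keepdim=True)
        scale = torch.clamp(1.0 - eta * lam1 / norms.clamp_min(1e-12), min=0.0)
        delta = scale * y

        # projection onto W: snap each perturbed embedding to its nearest word vector
        dists = torch.cdist(x_emb + delta, vocab_emb)    # (N, V) pairwise distances
        nearest = vocab_emb[dists.argmin(dim=1)]         # (N, d) closest vocabulary rows
        return nearest - x_emb                           # new delta with x + delta in W

Rows zeroed by the shrinkage project back to their original word, which is what keeps the number of changed words small.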
Open Source Code | Yes | Our source code is publicly available at https://github.com/cmhcbb/Seq2Sick.
Open Datasets | Yes | We use three datasets, DUC2003, DUC2004, and Gigaword, to conduct our attack for the text summarization task. ... For the machine translation task, we use 500 samples from the WMT'16 Multimodal Translation task. ... we train our model using 453k pairs from the Europarl corpus of German-English WMT'15, common crawl, and news-commentary.
Dataset Splits | No | The paper mentions using training data and evaluation on test samples, but it does not explicitly specify a 'validation' dataset split with percentages, sample counts, or a distinct validation set.
Hardware Specification | No | The paper does not specify any particular hardware details such as GPU models, CPU models, or memory used for experiments.
Software Dependencies | No | The paper mentions using OpenNMT-py but does not provide specific version numbers for it or any other software dependencies.
Experiment Setup | Yes | We set the beam search size to be 5 as suggested. ... The architecture consists of a 2-layer stacked LSTM with 500 hidden units. ... We set λ = 1 in all non-overlapping experiments. ... We set λ_1 = λ_2 = 1 in our objective function (9) in all our experiments.
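
For context, here is a minimal PyTorch sketch of a victim model matching the quoted setup (a 2-layer stacked LSTM encoder-decoder with 500 hidden units). The embedding size, the absence of attention, and the class name Seq2Seq are assumptions; the quoted setup does not specify them.

    import torch
    import torch.nn as nn

    EMB_DIM = 500      # assumption: embedding size is not given in the quoted setup
    HID_DIM = 500      # "2-layer stacked LSTM with 500 hidden units"
    NUM_LAYERS = 2
    BEAM_SIZE = 5      # decoding-time beam width: "beam search size to be 5"

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab_size, tgt_vocab_size):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab_size, EMB_DIM)
            self.tgt_emb = nn.Embedding(tgt_vocab_size, EMB_DIM)
            self.encoder = nn.LSTM(EMB_DIM, HID_DIM, NUM_LAYERS, batch_first=True)
            self.decoder = nn.LSTM(EMB_DIM, HID_DIM, NUM_LAYERS, batch_first=True)
            self.out = nn.Linear(HID_DIM, tgt_vocab_size)

        def forward(self, src, tgt):
            # condition the decoder on the encoder's final (h, c) state
            _, state = self.encoder(self.src_emb(src))
            dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
            return self.out(dec_out)   # per-step vocabulary logits

The weight λ_1 from the quoted setup plays the role of lam1 in the attack sketch above; λ_2 weights the objective's second regularization term (not shown in the sketch).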