Seq2Sick: Evaluating the Robustness of Sequence-to-Sequence Models with Adversarial Examples

Authors: Minhao Cheng, Jinfeng Yi, Pin-Yu Chen, Huan Zhang, Cho-Jui Hsieh

AAAI 2020, pp. 3601–3608

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our algorithm to machine translation and text summarization tasks, and verify the effectiveness of the proposed algorithm: by changing fewer than 3 words, we can make a seq2seq model produce desired outputs with high success rates. ... We conduct experiments on two widely-used applications of seq2seq models: text summarization and machine translation.
Researcher Affiliation | Collaboration | Minhao Cheng (1), Jinfeng Yi (2), Pin-Yu Chen (3), Huan Zhang (1), Cho-Jui Hsieh (1); (1) Department of Computer Science, UCLA; (2) JD AI Research; (3) IBM Research
Pseudocode | Yes | Algorithm 1 (Seq2Sick):

    Input: input sequence x = {x_1, ..., x_N}, seq2seq model, target keywords {k_1, ..., k_T}
    Output: adversarial sequence x* = x + δ*
    Let s = {s_1, ..., s_M} denote the original output of x.
    Set the loss L(·) in (9) to be (3)
    if Targeted Keyword Attack then
        set the loss L(·) in (9) to be (7)
    end if
    for r = 1, 2, ..., T do
        back-propagate L to obtain the gradient ∇L(x + δ_r)
        y_{r+1} = δ_r − η ∇L(x + δ_r)                        (gradient step)
        for i = 1, 2, ..., N do                              (group-lasso shrinkage per position)
            if ||y_{r+1,i}|| > ηλ_1 then
                δ_{r+1,i} = y_{r+1,i} − ηλ_1 y_{r+1,i} / ||y_{r+1,i}||
            else
                δ_{r+1,i} = 0
            end if
        end for
        project δ_{r+1} so that x + δ_{r+1} ∈ W             (nearest embeddings in W)
    end for
    δ* = δ_T
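
The loop above is a projected proximal-gradient iteration: a gradient step on the attack loss, a group-lasso shrinkage that zeroes out positions with small perturbations (so only a few words change), and a projection back onto the word-embedding space W. Below is a minimal PyTorch sketch of one such iteration, assuming the attack operates on input word embeddings; the function name seq2sick_step, the step size eta, and the hard nearest-neighbor projection are illustrative, not the authors' exact implementation.

    import torch

    def seq2sick_step(delta, x_emb, loss_fn, vocab_emb, eta=0.1, lam1=1.0):
        # delta:     (N, d) current perturbation of the input embeddings
        # x_emb:     (N, d) embeddings of the original input tokens
        # loss_fn:   maps perturbed embeddings to the attack loss
        #            (eq. (3) for non-overlapping, eq. (7) for keyword attacks)
        # vocab_emb: (V, d) embedding matrix, i.e. the space W
        delta = delta.detach().requires_grad_(True)
        loss_fn(x_emb + delta).backward()

        # gradient step: y = delta - eta * grad L(x + delta)
        y = delta.detach() - eta * delta.grad

        # group-lasso proximal shrinkage: rows with norm <= eta*lam1 become 0,
        # larger rows shrink by eta*lam1 along their own direction
        norms = y.norm(dim=1, keepdim=True)
        scale = torch.clamp(1.0 - eta * lam1 / norms.clamp_min(1e-12), min=0.0)
        delta = scale * y

        # projection onto W: snap each perturbed embedding to its nearest word vector
        dists = torch.cdist(x_emb + delta, vocab_emb)    # (N, V) pairwise distances
        nearest = vocab_emb[dists.argmin(dim=1)]         # (N, d) closest vocabulary rows
        return nearest - x_emb                           # new delta with x + delta in W

Rows zeroed by the shrinkage project back to their original word, which is what keeps the number of changed words small.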
Open Source Code | Yes | Our source code is publicly available at https://github.com/cmhcbb/Seq2Sick.
Open Datasets | Yes | We use three datasets, DUC2003, DUC2004, and Gigaword, to conduct our attack for the text summarization task. ... For the machine translation task, we use 500 samples from the WMT'16 Multimodal Translation task. ... we train our model using 453k pairs from the Europarl corpus of German-English WMT'15, common crawl, and news-commentary.
Dataset Splits | No | The paper mentions using training data and evaluation on test samples, but it does not explicitly specify a 'validation' dataset split with percentages, sample counts, or a distinct validation set.
Hardware Specification | No | The paper does not specify any particular hardware details such as GPU models, CPU models, or memory used for experiments.
Software Dependencies | No | The paper mentions using OpenNMT-py but does not provide specific version numbers for it or any other software dependencies.
Experiment Setup | Yes | We set the beam search size to be 5 as suggested. ... The architecture consists of a 2-layer stacked LSTM with 500 hidden units. ... We set λ = 1 in all non-overlapping experiments. ... We set λ_1 = λ_2 = 1 in our objective function (9) in all our experiments.
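
For context, here is a minimal PyTorch sketch of a victim model matching the quoted setup (a 2-layer stacked LSTM encoder-decoder with 500 hidden units). The embedding size, the absence of attention, and the class name Seq2Seq are assumptions; the quoted setup does not specify them.

    import torch
    import torch.nn as nn

    EMB_DIM = 500      # assumption: embedding size is not given in the quoted setup
    HID_DIM = 500      # "2-layer stacked LSTM with 500 hidden units"
    NUM_LAYERS = 2
    BEAM_SIZE = 5      # decoding-time beam width: "beam search size to be 5"

    class Seq2Seq(nn.Module):
        def __init__(self, src_vocab_size, tgt_vocab_size):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab_size, EMB_DIM)
            self.tgt_emb = nn.Embedding(tgt_vocab_size, EMB_DIM)
            self.encoder = nn.LSTM(EMB_DIM, HID_DIM, NUM_LAYERS, batch_first=True)
            self.decoder = nn.LSTM(EMB_DIM, HID_DIM, NUM_LAYERS, batch_first=True)
            self.out = nn.Linear(HID_DIM, tgt_vocab_size)

        def forward(self, src, tgt):
            # condition the decoder on the encoder's final (h, c) state
            _, state = self.encoder(self.src_emb(src))
            dec_out, _ = self.decoder(self.tgt_emb(tgt), state)
            return self.out(dec_out)   # per-step vocabulary logits

The weight λ_1 from the quoted setup plays the role of lam1 in the attack sketch above; λ_2 weights the objective's second regularization term (not shown in the sketch).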