MINIMAL: Mining Models for Universal Adversarial Triggers

Authors: Yaman Kumar Singla, Swapnil Parekh, Somesh Singh, Changyou Chen, Balaji Krishnamurthy, Rajiv Ratn Shah

AAAI 2022, pp. 11330-11339 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type: Experimental
"Using the triggers produced with our data-free algorithm, we reduce the accuracy of the Stanford Sentiment Treebank's positive class from 93.6% to 9.6%. Similarly, for the Stanford Natural Language Inference (SNLI) corpus, our single-word trigger reduces the accuracy of the entailment class from 90.95% to less than 0.6%."
Researcher Affiliation: Collaboration
Yaman Kumar Singla (1,2,3), Somesh Singh (2), Swapnil Parekh (4), Balaji Krishnamurthy (1), Rajiv Ratn Shah (2), Changyou Chen (3). Affiliations: 1 Adobe Media Data Science Research, 2 IIIT-Delhi, 3 SUNY at Buffalo, 4 New York University.
Pseudocode: No
The paper describes its algorithms through figures (Fig. 2 and Fig. 3) and equations, but it does not present formal pseudocode blocks or clearly labeled algorithm sections in a structured, code-like format.
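Since the paper presents its trigger search only through figures and equations, a minimal toy sketch may help fix ideas. The function below is a hypothetical illustration, not the authors' implementation: it assumes a linear bag-of-embeddings classifier and performs a HotFlip-style greedy token replacement (in the spirit of Wallace et al. 2019), where each trigger token is swapped for the vocabulary item whose first-order approximation most decreases the target class score.

```python
import numpy as np

def hotflip_trigger_search(E, W, target_class, trigger_len=3, iters=10, rng=None):
    """Toy HotFlip-style search for a universal adversarial trigger.

    E: (V, d) embedding matrix for a vocabulary of V tokens.
    W: (C, d) weight matrix of a linear classifier over mean-pooled embeddings.
    Greedily replaces trigger tokens to drive down the `target_class` logit.
    """
    rng = rng or np.random.default_rng(0)
    V, d = E.shape
    trigger = list(rng.integers(0, V, size=trigger_len))  # random init sequence
    for _ in range(iters):
        for pos in range(trigger_len):
            # Gradient of the target-class logit w.r.t. this token's embedding;
            # for a linear model over the mean-pooled trigger it is W[target]/len.
            grad = W[target_class] / trigger_len
            # First-order change in the logit from swapping in each vocab token.
            scores = (E - E[trigger[pos]]) @ grad
            trigger[pos] = int(np.argmin(scores))  # steepest decrease wins
    return trigger
```

For the linear toy model the first-order approximation is exact, so the search converges to the token(s) that minimize the target-class logit; with a real neural model the gradient is only a local guide and the loop must re-evaluate candidates each sweep.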
Open Source Code: Yes
"The code and reproducibility steps are given in https://github.com/midas-research/data-free-uats"
Open Datasets: Yes
"We use the Stanford Sentiment Treebank (SST) dataset (Socher et al. 2013). For natural language inference, we use the well-known Stanford Natural Language Inference (SNLI) Corpus (Bowman et al. 2015). For paraphrase identification, we use the Microsoft Research Paraphrase Corpus (MRPC) (Dolan and Brockett 2005)."
Dataset Splits: Yes
"Table 1: Number of samples required to generate Universal Adversarial Triggers for each dataset. In a data-based approach like (Wallace et al. 2019), the validation set (column 2) is used to generate the UATs." Validation-set sizes: SST 900, SNLI 9000, MRPC 800.
Hardware Specification: No
The paper does not provide hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies: No
The paper mentions models and embeddings such as a Bi-LSTM with word2vec and an ALBERT model, but it does not specify software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or other libraries).
Experiment Setup: No
The paper describes the algorithms for generating class impressions and triggers and discusses varying initialization sequences and trigger lengths, but it does not give concrete hyperparameter values (e.g., learning rate, batch size, number of epochs, optimizer settings) or detailed system-level training configurations for the models used in the experiments.
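The data-free step the paper relies on is generating "class impressions": synthetic inputs optimized to make the model predict a target class, which then stand in for real validation data. The sketch below is a hypothetical toy version, assuming a linear softmax classifier and optimizing a single vector by gradient ascent on the target class's log-probability; the paper itself optimizes over word embeddings of full sentences.

```python
import numpy as np

def class_impression(W, target_class, d, steps=200, lr=0.1, rng=None):
    """Toy class-impression generation for a linear softmax classifier.

    W: (C, d) classifier weights. Returns a d-dim synthetic "input"
    optimized so the model assigns high probability to `target_class`.
    """
    rng = rng or np.random.default_rng(0)
    x = rng.normal(size=d)  # random starting point, no real data needed
    for _ in range(steps):
        logits = W @ x
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # Gradient of log-softmax of the target class w.r.t. x.
        grad = W[target_class] - p @ W
        x += lr * grad  # ascend toward the target class
    return x
```

Triggers mined against such impressions transfer to real inputs because the impressions concentrate mass on the same decision regions the model uses at test time; that is the intuition, at least, behind the data-free setup.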