Handcrafted Backdoors in Deep Neural Networks

Authors: Sanghyun Hong, Nicholas Carlini, Alexey Kurakin

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our handcrafted backdoor attack on four benchmarking tasks (MNIST, SVHN, CIFAR10, and PubFigs) and four different network architectures. Our results demonstrate the effectiveness of our backdoor attack: In all the backdoored models that we handcraft, we achieve an attack success rate ≥96% with only a small accuracy drop (≤3%). (An illustrative sketch of how these two metrics are computed appears below the table.)
Researcher Affiliation | Collaboration | Sanghyun Hong (Oregon State University), Nicholas Carlini (Google Brain), Alexey Kurakin (Google Brain)
Pseudocode | No | The paper describes its handcrafting procedure in numbered steps but does not include formal pseudocode or algorithm blocks labeled as such. (A loosely illustrative weight-editing sketch appears below the table.)
Open Source Code | No | The paper does not contain an explicit statement about releasing its source code or provide any links to a code repository for the described methodology.
Open Datasets | Yes | We evaluate our handcrafted attack on four benchmark classification tasks used in prior backdooring work: MNIST [25], SVHN [33], CIFAR10 [23], and PubFigs [38]. (A data-loading sketch appears below the table.)
Dataset Splits | No | The paper mentions using a 'small subset of samples' for ablation analysis and 'test-time samples' but does not explicitly provide details about training, validation, or test splits (e.g., specific percentages or sample counts for each split).
Hardware Specification | No | The paper mentions that an attacker can be successful 'on a CPUs' but does not specify any particular CPU models (e.g., Intel Core i7, Xeon), GPU models, or other specific hardware configurations used for conducting the experiments.
Software Dependencies | No | The paper does not provide specific software dependency names with version numbers (e.g., names and versions of libraries, frameworks, or programming languages).
Experiment Setup | Yes | In PubFigs, we fine-tune only the last layer of a teacher pre-trained on VGGFace2 (see Appendix D for the architecture details and the training hyperparameters we use). (A last-layer fine-tuning sketch appears below the table.)
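
For context on the Research Type row, here is a minimal sketch of how attack success rate and clean accuracy are commonly measured for a backdoored classifier, assuming PyTorch. The names `model`, `clean_loader`, `apply_trigger`, and `target_class` are hypothetical and not taken from the paper.

```python
import torch

@torch.no_grad()
def evaluate_backdoor(model, clean_loader, apply_trigger, target_class, device="cpu"):
    """Return (clean accuracy, attack success rate) for a backdoored classifier."""
    model.eval()
    model.to(device)
    clean_correct, triggered_hits, total = 0, 0, 0
    for images, labels in clean_loader:
        images, labels = images.to(device), labels.to(device)
        # Clean accuracy: predictions on unmodified test inputs.
        clean_correct += (model(images).argmax(dim=1) == labels).sum().item()
        # Attack success rate: fraction of trigger-stamped inputs predicted as the target class.
        triggered_hits += (model(apply_trigger(images)).argmax(dim=1) == target_class).sum().item()
        total += labels.size(0)
    return clean_correct / total, triggered_hits / total
```

The accuracy drop quoted in the table would then be the original model's clean accuracy minus the backdoored model's clean accuracy.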
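Since the paper gives its handcrafting procedure only as numbered prose steps, the following is a loose, generic illustration of the underlying idea: directly editing trained weights, with no retraining, so that a trigger patch steers predictions toward a target class. It is not the authors' algorithm, which is designed to preserve clean accuracy; every name here (`SmallNet`, the trigger mask, the chosen neuron index, the `boost` constant) is hypothetical.

```python
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    """Tiny MLP classifier, used only to make the weight edits concrete."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU())
        self.classifier = nn.Linear(64, 10)

    def forward(self, x):
        return self.classifier(self.features(x))

def handcraft_backdoor(model, trigger_mask, target_class, neuron_idx=0, boost=5.0):
    """Edit weights in place: make one hidden unit fire on the trigger patch,
    then wire that unit strongly to the attacker's target class logit."""
    with torch.no_grad():
        hidden = model.features[1]                       # the first Linear layer
        # Make the chosen hidden unit respond to pixels under the trigger mask.
        hidden.weight[neuron_idx] = boost * trigger_mask.flatten()
        hidden.bias[neuron_idx] = 0.0
        # Route that unit's activation exclusively to the attacker's target class.
        model.classifier.weight[:, neuron_idx] = 0.0
        model.classifier.weight[target_class, neuron_idx] = boost
    return model

# Example trigger: a 3x3 bright patch in the top-left corner of a 28x28 image.
mask = torch.zeros(28, 28)
mask[:3, :3] = 1.0
backdoored = handcraft_backdoor(SmallNet(), mask, target_class=7)
```

The paper's actual procedure analyzes existing activations and edits parameters far more carefully so that clean accuracy drops only slightly; this sketch conveys only the weight-editing flavor of the attack.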
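The three standard image benchmarks quoted in the Open Datasets row can be fetched directly with torchvision; this is an assumption about tooling, since the paper does not state its data-loading stack, and the PubFigs face data is not bundled with torchvision and must be obtained separately.

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
mnist = datasets.MNIST(root="data", train=True, download=True, transform=to_tensor)
svhn = datasets.SVHN(root="data", split="train", download=True, transform=to_tensor)
cifar10 = datasets.CIFAR10(root="data", train=True, download=True, transform=to_tensor)
# PubFigs is a web-scraped face dataset and is not distributed with torchvision.
```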
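Finally, a sketch of the last-layer-only fine-tuning described in the Experiment Setup row, assuming PyTorch. The backbone, class count, optimizer, and learning rate below are placeholders; the actual VGGFace2-pretrained teacher architecture and training hyperparameters are given in the paper's Appendix D.

```python
import torch
import torch.nn as nn
from torchvision import models

num_identities = 10                              # placeholder class count, not from the paper
teacher = models.resnet50(weights=None)          # stand-in for the VGGFace2-pretrained teacher
for param in teacher.parameters():
    param.requires_grad = False                  # freeze the pretrained feature extractor
teacher.fc = nn.Linear(teacher.fc.in_features, num_identities)  # new, trainable last layer
optimizer = torch.optim.SGD(teacher.fc.parameters(), lr=1e-2, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```

Because every pretrained parameter is frozen and only the replacement head is passed to the optimizer, gradient updates touch the last layer alone, matching the "fine-tune only the last layer" description.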