Topological Detection of Trojaned Neural Networks
Authors: Songzhu Zheng, Yikai Zhang, Hubert Wagner, Mayank Goswami, Chao Chen
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared to standard baselines, it displays better performance on multiple benchmarks. Experiments on synthetic and competition datasets show that our method is highly effective, outperforming existing approaches. |
| Researcher Affiliation | Collaboration | 1Stony Brook University, {zheng.songzhu,chao.chen.1}@stonybrook.edu 2Morgan Stanley, Yikai.Zhang@morganstanley.com 3University of Florida, hwagner@ufl.edu 4City University of New York, mayank.goswami@qc.cuny.edu |
| Pseudocode | Yes | Algorithm 1 Topological Abnormality Trojan Detection |
| Open Source Code | Yes | The code of this paper can be found at https://github.com/TopoXLab/TopoTrojDetection. |
| Open Datasets | Yes | We generate our synthetic dataset using the NIST trojai toolkit. In synthetic datasets, we trained 140 LeNet5 [40] and 120 ResNet18 [25] with MNIST [40] separately. We also trained 120 ResNet18 and 120 Densenet121 [31] with CIFAR10 [34] separately. We also test our methods using the IARPA/NIST trojai competition public dataset [50]. |
| Dataset Splits | No | No explicit mention of validation dataset splits (e.g., percentages or counts for training, validation, and test sets) or a specific validation procedure was found. The paper discusses training and testing. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions tools like 'NIST trojai toolkit' but does not provide specific version numbers for this or any other software libraries, frameworks (e.g., PyTorch, TensorFlow), or operating systems used in the experimental setup. |
| Experiment Setup | Yes | We manually applied a 20% one-to-one Trojan attack. Specifically, for Trojaned databases, we picked one of the source classes and added a reverse-lambda-shaped trigger (Figure 1) to a random corner of the input images. Then we changed the edited images' class to a predetermined target class and mixed them into the training database. Trojaned models trained with MNIST datasets are constrained to maintain at least a 95% successful attack rate... and models trained with CIFAR10 are constrained to maintain at least an 87% successful attack rate. We use linear interpolation as the triggered sample generation method with hyper-parameter 0.3. |
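The one-to-one attack described in the setup row (stamp a trigger patch into a random corner of a fraction of source-class images, then relabel them to the target class) can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the function name `poison_dataset`, the grayscale image layout, and the default 20% poison rate are assumptions for the example.

```python
import numpy as np

def poison_dataset(images, labels, source_class, target_class,
                   trigger, poison_rate=0.2, rng=None):
    """Hypothetical sketch of a one-to-one Trojan attack.

    A `poison_rate` fraction of `source_class` images get `trigger`
    stamped into a random corner, and their labels are flipped to
    `target_class`. Assumes grayscale images of shape (N, H, W).
    """
    rng = np.random.default_rng(rng)
    images, labels = images.copy(), labels.copy()
    src_idx = np.flatnonzero(labels == source_class)
    n_poison = int(len(src_idx) * poison_rate)
    chosen = rng.choice(src_idx, size=n_poison, replace=False)
    th, tw = trigger.shape
    H, W = images.shape[1:3]
    # The four corners where the trigger patch may be placed.
    corners = [(0, 0), (0, W - tw), (H - th, 0), (H - th, W - tw)]
    for i in chosen:
        r, c = corners[rng.integers(len(corners))]
        images[i, r:r + th, c:c + tw] = trigger  # stamp the trigger
        labels[i] = target_class                 # flip the label
    return images, labels
```

A model trained on the resulting mixture behaves normally on clean inputs but maps triggered source-class inputs to the target class, which is the behavior the attack-success-rate constraints (95% on MNIST, 87% on CIFAR10) quantify.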