Topological Detection of Trojaned Neural Networks
Authors: Songzhu Zheng, Yikai Zhang, Hubert Wagner, Mayank Goswami, Chao Chen
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Compared to standard baselines, it displays better performance on multiple benchmarks. Experiments on synthetic and competition datasets show that our method is highly effective, outperforming existing approaches. |
| Researcher Affiliation | Collaboration | 1Stony Brook University, {zheng.songzhu,chao.chen.1}@stonybrook.edu 2Morgan Stanley, Yikai.Zhang@morganstanley.com 3University of Florida, hwagner@ufl.edu 4City University of New York, mayank.goswami@qc.cuny.edu |
| Pseudocode | Yes | Algorithm 1 Topological Abnormality Trojan Detection |
| Open Source Code | Yes | The code of this paper can be found at https://github.com/TopoXLab/TopoTrojDetection. |
| Open Datasets | Yes | We generate our synthetic dataset using the NIST trojai toolkit. In synthetic datasets, we trained 140 LeNet5 [40] and 120 ResNet18 [25] with MNIST [40] separately. We also trained 120 ResNet18 and 120 Densenet121 [31] with CIFAR10 [34] separately. We also test our methods using the IARPA/NIST trojai competition public dataset [50]. |
| Dataset Splits | No | No explicit mention of validation dataset splits (e.g., percentages or counts for training, validation, and test sets) or a specific validation procedure was found. The paper discusses training and testing. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments are mentioned in the paper. |
| Software Dependencies | No | The paper mentions tools like 'NIST trojai toolkit' but does not provide specific version numbers for this or any other software libraries, frameworks (e.g., PyTorch, TensorFlow), or operating systems used in the experimental setup. |
| Experiment Setup | Yes | We manually applied a 20% one-to-one Trojan attack. Specifically, for Trojaned databases, we picked one of the source classes and added a reverse-lambda-shaped trigger (Figure 1) to a random corner of the input images. Then we changed the edited images' class to a predetermined target class and mixed them into the training database. Trojaned models trained with MNIST datasets are constrained to maintain at least a 95% successful attack rate... and models trained with CIFAR10 are constrained to maintain at least an 87% successful attack rate. We use linear interpolation as the triggered sample generation method with hyper-parameter 0.3. |
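The one-to-one attack described in the setup row (stamp a trigger patch into a random corner of a fraction of source-class images, then relabel them to the target class) can be sketched as follows. This is a minimal illustrative implementation, not the authors' code: the function name `poison_dataset`, the grayscale image layout, and the default 20% poison rate are assumptions for the example.

```python
import numpy as np

def poison_dataset(images, labels, source_class, target_class,
                   trigger, poison_rate=0.2, rng=None):
    """Hypothetical sketch of a one-to-one Trojan attack.

    A `poison_rate` fraction of `source_class` images get `trigger`
    stamped into a random corner, and their labels are flipped to
    `target_class`. Assumes grayscale images of shape (N, H, W).
    """
    rng = np.random.default_rng(rng)
    images, labels = images.copy(), labels.copy()
    src_idx = np.flatnonzero(labels == source_class)
    n_poison = int(len(src_idx) * poison_rate)
    chosen = rng.choice(src_idx, size=n_poison, replace=False)
    th, tw = trigger.shape
    H, W = images.shape[1:3]
    # The four corners where the trigger patch may be placed.
    corners = [(0, 0), (0, W - tw), (H - th, 0), (H - th, W - tw)]
    for i in chosen:
        r, c = corners[rng.integers(len(corners))]
        images[i, r:r + th, c:c + tw] = trigger  # stamp the trigger
        labels[i] = target_class                 # flip the label
    return images, labels
```

A model trained on the resulting mixture behaves normally on clean inputs but maps triggered source-class inputs to the target class, which is the behavior the attack-success-rate constraints (95% on MNIST, 87% on CIFAR10) quantify.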