Extracting Robust Models with Uncertain Examples
Authors: Guanlin Li, Guowen Xu, Shangwei Guo, Han Qiu, Jiwei Li, Tianwei Zhang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that BEST outperforms existing attack methods over different datasets and model architectures under limited data. |
| Researcher Affiliation | Collaboration | Guanlin Li1,2, Guowen Xu1, Shangwei Guo3, Han Qiu4,5, Jiwei Li6,7, Tianwei Zhang1. 1Nanyang Technological University, 2S-Lab, NTU, 3Chongqing University, 4Tsinghua University, 5Zhongguancun Laboratory, 6Shannon.AI, 7Zhejiang University. |
| Pseudocode | Yes | Algorithm 1 Boundary Entropy Searching Thief |
| Open Source Code | Yes | Our codes can be found in https://github.com/GuanlinLee/BEST. |
| Open Datasets | Yes | We choose two datasets: CIFAR10 (Krizhevsky et al., 2009) and CIFAR100 (Krizhevsky et al., 2009) |
| Dataset Splits | Yes | In our implementation, we split the test sets of CIFAR10 and CIFAR100 into two disjoint parts: (1) an extraction set DA is used by the adversary to steal the victim model; (2) a validation set DT is used to evaluate the attack results and the victim model's performance during its training process. Both DA and DT contain 5,000 samples. |
| Hardware Specification | Yes | Specifically, when BS = 10, for ResNet-18, it costs about 16s on V100 to generate 5,000 UEs. For WRN-28-10, it costs about 80s on V100 to generate 5,000 UEs. ... When training a ResNet-18 on a single V100 card, the time cost for one epoch is 17s. When training a WideResNet-28-10 on a single V100 card, the time cost for one epoch is 80s. |
| Software Dependencies | No | The paper mentions software components like SGD (optimizer) but does not provide specific version numbers for any software, libraries, or frameworks (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | In all experiments, the learning rate of model extraction is set as 0.1 at the beginning and decays at the 100th and 150th epoch with a factor of 0.1. The optimizer in all experiments is SGD, with a start learning rate of 0.1, momentum of 0.9 and weight decay of 0.0001. The total number of extraction epochs is 200. In each epoch, the adversary queries all data in his training set DA. The batch size is 128. The hyperparameter in JBDA for Jacobian matrix multiplication is β = 0.1. For ARD, IAD, RSLAD and our BEST, the hyperparameters for query sample generation under L∞-norm are ϵ = 8/255, η = 2/255 and BS = 10. |
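The reported setup (step learning-rate schedule decaying by 0.1 at epochs 100 and 150, and a disjoint 5,000/5,000 split of the test set into DA and DT) can be sketched in plain Python. This is a hypothetical illustration of the reported hyperparameters, not the authors' code; the function names and the fixed seed are assumptions.

```python
import random

def extraction_lr(epoch, base_lr=0.1, milestones=(100, 150), gamma=0.1):
    """Learning rate at a given epoch under the reported step schedule:
    starts at 0.1 and is multiplied by 0.1 at epochs 100 and 150."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr *= gamma
    return lr

def split_extraction_validation(test_indices, seed=0):
    """Hypothetical split of a 10,000-sample test set into two disjoint
    halves: an extraction set DA and a validation set DT, 5,000 each.
    The shuffle seed is an assumption for reproducibility."""
    rng = random.Random(seed)
    idx = list(test_indices)
    rng.shuffle(idx)
    half = len(idx) // 2
    return idx[:half], idx[half:]

# Example: schedule values and a CIFAR-style split.
lrs = [extraction_lr(e) for e in (0, 99, 100, 150, 199)]
da, dt = split_extraction_validation(range(10000))
```

Under this sketch the learning rate is 0.1 for epochs 0-99, 0.01 for 100-149, and 0.001 from epoch 150 onward, matching the 200-epoch schedule quoted above.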