Defensive Quantization: When Efficiency Meets Robustness
Authors: Ji Lin, Chuang Gan, Song Han
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first conduct an empirical study to show that vanilla quantization suffers more from adversarial attacks. ... Extensive experiments on CIFAR-10 and SVHN datasets demonstrate that our new quantization method can defend neural networks against adversarial examples, and even achieves superior robustness than their full-precision counterparts, while maintaining the same hardware efficiency as vanilla quantization approaches. |
| Researcher Affiliation | Collaboration | Ji Lin MIT jilin@mit.edu Chuang Gan MIT-IBM Watson AI Lab ganchuang@csail.mit.edu Song Han MIT songhan@mit.edu |
| Pseudocode | No | The paper presents the objective function as L = L_CE + β ||W_l^T W_l - I||_2 in Equation (8) and a diagram in Figure 2, but does not include a clearly labeled 'Pseudocode' or 'Algorithm' block. (A hedged code sketch of this regularization term is given after the table.) |
| Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository. |
| Open Datasets | Yes | We conduct experiments with Wide ResNet (Zagoruyko & Komodakis, 2016) of 28-10 on the CIFAR-10 dataset (Krizhevsky & Hinton, 2009) ... on the Street View House Numbers (SVHN) dataset (Netzer et al., 2011). |
| Dataset Splits | Yes | CIFAR-10 is another widely used dataset containing 50,000 training samples and 10,000 testing samples of size 32×32. ... We conduct experiments with Wide ResNet (Zagoruyko & Komodakis, 2016) of 28-10 on the CIFAR-10 dataset (Krizhevsky & Hinton, 2009) using ReLU6-based activation quantization, with the number of bits ranging from 1 to 5. All the models are trained following (Zagoruyko & Komodakis, 2016) with momentum SGD for 200 epochs. |
| Hardware Specification | No | We acknowledge Google and Amazon providing us the cloud computing resources. ... NVIDIA recently introduced INT4 in Turing architecture. These mentions are not specific enough to identify the hardware used for experiments (e.g., exact GPU model, CPU, RAM, or specific cloud instance types). |
| Software Dependencies | No | TensorFlow (Abadi et al., 2016) is mentioned as a framework, but no specific version numbers are provided for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | All the models are trained following (Zagoruyko & Komodakis, 2016) with momentum SGD for 200 epochs. The adversarial samples are generated using an FGSM attacker with ϵ = 8. For CIFAR-10, the model is trained for 200 epochs with initial learning rate 0.1, decayed by a factor of 0.2 at 60, 120 and 160 epochs. For the SVHN dataset, the model is trained for 160 epochs, with initial learning rate 0.01, decayed by 0.1 at 80 and 120 epochs. For DQ, we used bit = 1 and β = 2e-3. (A hedged FGSM sketch follows the table.) |
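
The regularization term quoted in the Pseudocode row can be illustrated in a few lines. The snippet below is a minimal sketch, not the authors' released code: it uses PyTorch (the paper mentions TensorFlow), the helper name `orthogonal_regularizer`, the Frobenius norm, and the flattening of convolutional kernels are all assumptions made here for illustration. It computes β·‖W Wᵀ − I‖ summed over weight layers, to be added to the cross-entropy loss as in Equation (8).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def orthogonal_regularizer(model: nn.Module, beta: float = 2e-3):
    """Sum beta * ||W W^T - I|| over weight layers (an Eq. (8)-style term).

    Convolutional kernels are flattened to (out_channels, -1) and the smaller
    Gram matrix W W^T is penalized; this flattening and the Frobenius norm
    are illustrative assumptions, not details stated in the quoted text.
    """
    reg = 0.0
    for module in model.modules():
        if isinstance(module, (nn.Linear, nn.Conv2d)):
            w = module.weight.flatten(1)                  # shape (out, fan_in)
            gram = w @ w.t()                              # shape (out, out)
            eye = torch.eye(gram.size(0), device=gram.device)
            reg = reg + beta * torch.norm(gram - eye)     # Frobenius norm
    return reg

# Usage: total loss = cross-entropy + regularizer, with beta = 2e-3 as reported
# loss = F.cross_entropy(model(x), y) + orthogonal_regularizer(model, beta=2e-3)
```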
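
For the Experiment Setup row, the FGSM evaluation with ϵ = 8 can be sketched as below. This is a hedged illustration rather than the authors' code: the `fgsm_attack` name is hypothetical, and ϵ = 8 is assumed to refer to the 0–255 pixel scale (i.e. 8/255 for inputs normalized to [0, 1]).

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=8.0 / 255.0):
    """Single-step FGSM: x_adv = clip(x + eps * sign(grad_x CE(f(x), y))).

    epsilon = 8/255 assumes inputs in [0, 1]; the paper states eps = 8,
    presumably on the raw 0-255 pixel scale.
    """
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + epsilon * grad.sign()).clamp(0.0, 1.0)
    return x_adv.detach()
```

The reported learning-rate schedules are plain step decay, e.g. for CIFAR-10 they would correspond to momentum SGD with `torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[60, 120, 160], gamma=0.2)` in this PyTorch framing.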