Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness

Authors: Tianlong Chen, Huan Zhang, Zhenyu Zhang, Shiyu Chang, Sijia Liu, Pin-Yu Chen, Zhangyang Wang

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5. Experiments. Datasets and architectures. Our experiments are conducted on three representative datasets in adversarial robustness and verification literature, MNIST (Deng, 2012), SVHN (Netzer et al., 2011) and CIFAR-10 (Krizhevsky & Hinton, 2009).
Researcher Affiliation | Collaboration | 1) University of Texas at Austin; 2) Carnegie Mellon University; 3) University of California, Santa Barbara; 4) Michigan State University; 5) MIT-IBM Watson AI Lab; 6) IBM Research.
Pseudocode | No | The paper describes procedures in text and figures (e.g., Figure 1), but it does not contain any formal pseudocode or algorithm blocks. (An illustrative sketch of the grafting operation appears below the table.)
Open Source Code | Yes | Codes are available at https://github.com/VITA-Group/Linearity-Grafting.
Open Datasets | Yes | Our experiments are conducted on three representative datasets in adversarial robustness and verification literature, MNIST (Deng, 2012), SVHN (Netzer et al., 2011) and CIFAR-10 (Krizhevsky & Hinton, 2009).
Dataset Splits | No | The paper discusses evaluation on “test sets” and notes that “VA is computed on the first 1,000 images,” but it does not provide specific train/validation/test dataset splits (e.g., percentages or counts) for reproducibility. (See the data-pipeline sketch below the table.)
Hardware Specification | No | OOM indicates that DNNs have too many unstable neurons and the verifier is unable to load it with 48 GB GPU memory, leading to ∞ verification time and a null VA (–).
Software Dependencies | No | The paper mentions using an SGD optimizer and cosine annealing schedule, but it does not specify any software names with their version numbers (e.g., PyTorch version, CUDA version).
Experiment Setup | Yes | For fast adversarial training (Wong et al., 2020), we adopt the effective GradAlign regularization (Andriushchenko & Flammarion, 2020) with a coefficient of 0.2, for all 200 training epochs. The learning rate starts from 0.1 and decays by ten times at epochs 100 and 150, while the batch size is 128. We use an SGD optimizer with 0.9 momentum and 5×10⁻⁴ weight decay. During the finetuning of grafted networks, an initial learning rate of 0.01 is used for trainable slopes and intercept (a, b) of grafted neurons, and 0.001 for original model parameters. And the learning rate decays with a cosine annealing schedule of 100 training epochs. (A configuration sketch mapping these hyperparameters onto PyTorch objects appears below the table.)
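
The sketches below are illustrative Python/PyTorch reconstructions based only on the quotes in the table above; they are not the authors' released code, which lives at the GitHub link in the table.

Although the paper gives no formal pseudocode, the grafting operation it describes, replacing selected ReLU neurons with a linear function whose slope a and intercept b are trainable, can be pictured as a drop-in activation module. The class name, mask handling, and initialization below are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class GraftedActivation(nn.Module):
    """Illustrative sketch (not the paper's code): a ReLU layer in which a
    chosen subset of neurons is grafted, i.e. replaced by a trainable
    linear map a * x + b."""

    def __init__(self, num_neurons: int, graft_mask: torch.Tensor):
        super().__init__()
        # graft_mask[i] = True means neuron i is grafted (linearized).
        self.register_buffer("graft_mask", graft_mask.float())
        # Trainable slope a and intercept b for every neuron; only the
        # grafted neurons actually use them in the forward pass.
        self.a = nn.Parameter(torch.ones(num_neurons))
        self.b = nn.Parameter(torch.zeros(num_neurons))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        relu_out = torch.relu(x)
        linear_out = self.a * x + self.b
        # Grafted neurons follow the linear branch; the rest stay ReLU.
        return self.graft_mask * linear_out + (1.0 - self.graft_mask) * relu_out
```

Which neurons are selected for grafting and how (a, b) are finetuned are exactly the parts the paper describes in text and figures (e.g., Figure 1) rather than in pseudocode.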
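
The datasets named in the quotes (MNIST, SVHN, CIFAR-10) are standard torchvision downloads, and the report notes that verified accuracy (VA) is computed on the first 1,000 test images. A minimal data-pipeline sketch, assuming the default torchvision splits (the paper states no custom re-splitting) and placeholder root paths and transforms:

```python
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader, Subset

# Standard torchvision splits for the three datasets cited in the paper;
# root paths and transforms are placeholder assumptions.
transform = T.ToTensor()
mnist_train = torchvision.datasets.MNIST("./data", train=True, download=True, transform=transform)
svhn_train = torchvision.datasets.SVHN("./data", split="train", download=True, transform=transform)
cifar_train = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=transform)
cifar_test = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=transform)

# Batch size 128 matches the quoted training setup.
train_loader = DataLoader(cifar_train, batch_size=128, shuffle=True)

# "VA is computed on the first 1,000 images": take test indices 0..999 in order.
va_subset = Subset(cifar_test, list(range(1000)))
va_loader = DataLoader(va_subset, batch_size=1, shuffle=False)
```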
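
The quoted "Experiment Setup" hyperparameters map directly onto standard PyTorch optimizer and scheduler objects. The sketch below mirrors the stated numbers; the toy model, the parameter-name filter used to separate the grafted (a, b) from the original weights, and the finetuning weight decay (left at the default) are assumptions, and the GradAlign regularization term itself is not implemented here.

```python
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR, CosineAnnealingLR

# Toy stand-in model; in the paper this would be the network being grafted.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 256), nn.ReLU(), nn.Linear(256, 10))

# Stage 1: fast adversarial training for 200 epochs (the GradAlign term with
# coefficient 0.2 is added to the loss in the paper; not shown here).
opt = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
sched = MultiStepLR(opt, milestones=[100, 150], gamma=0.1)  # decay by 10x at epochs 100, 150

# Stage 2: finetuning the grafted network for 100 epochs.
# Assumed naming: grafted slope/intercept parameters end in ".a" / ".b"
# (as in the GraftedActivation sketch above); everything else is "original".
graft_params = [p for n, p in model.named_parameters() if n.endswith((".a", ".b"))]
other_params = [p for n, p in model.named_parameters() if not n.endswith((".a", ".b"))]
ft_opt = SGD(
    [
        {"params": graft_params, "lr": 0.01},   # trainable slopes and intercepts (a, b)
        {"params": other_params, "lr": 0.001},  # original model parameters
    ],
    momentum=0.9,  # finetuning weight decay is not stated in the quote, so left at the default 0
)
ft_sched = CosineAnnealingLR(ft_opt, T_max=100)  # cosine annealing over 100 epochs
```

The batch size of 128 and the 200-/100-epoch budgets from the quote would be applied in the training loops, which are omitted here.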