In-Place Zero-Space Memory Protection for CNN

Authors: Hui Guan, Lin Ning, Zhen Lin, Xipeng Shen, Huiyang Zhou, Seung-Hwan Lim

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experiments on VGG16, ResNet18, and SqueezeNet validate the effectiveness of the proposed solution. |
| Researcher Affiliation | Collaboration | Hui Guan, Lin Ning, Zhen Lin, Xipeng Shen, and Huiyang Zhou (North Carolina State University, Raleigh, NC 27695) and Seung-Hwan Lim (Oak Ridge National Laboratory, Oak Ridge, TN 37831). Emails: {hguan2, lning, zlin4, xshen5, hzhou}@ncsu.edu, lims1@ornl.gov |
| Pseudocode | No | The paper describes algorithmic steps in prose and mathematical formulations (e.g., Section 4.1, "ADMM-based Training" and "QAT with Throttling (QATT)"), but it does not contain a clearly labeled "Pseudocode" or "Algorithm" block. |
| Open Source Code | No | The paper neither states that the source code for its methodology is released nor provides a link to a code repository. |
| Open Datasets | Yes | By default, we use the ImageNet dataset [6] (ILSVRC 2012) for model training and evaluation. |
| Dataset Splits | No | The paper states that the ImageNet dataset is used for model training and evaluation but does not give exact percentages or counts for training, validation, or test splits. Validation is used implicitly during WOT training (Figure 4), but no explicit split details are provided. |
| Hardware Specification | Yes | All experiments are performed with PyTorch 1.0.1 on machines equipped with a 40-core 2.2 GHz Intel Xeon Silver 4114 processor, 128 GB of RAM, and an NVIDIA TITAN Xp GPU with 12 GB of memory. |
| Software Dependencies | Yes | All experiments are performed with PyTorch 1.0.1; Distiller [32] is used for 8-bit quantization; the CUDA version is 10.1. |
| Experiment Setup | Yes | λ is set to 0.0001 for all of the CNNs. Model training uses stochastic gradient descent with a constant learning rate of 0.0001 and momentum of 0.9. Batch size is 32 for VGG16_bn and ResNet152, 64 for ResNet50 and VGG16, and 128 for the remaining models. Training stops once the model accuracy after weight throttling reaches that of the 8-bit quantized version. |
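The reported hyperparameters can be collected into a small configuration sketch. This is a plain-Python illustration of the setup described above, not code from the paper (which uses PyTorch 1.0.1); the constant and function names are hypothetical.

```python
# Illustrative configuration mirroring the reported experiment setup.
# Names are hypothetical; only the values come from the paper's description.
ADMM_LAMBDA = 1e-4   # regularization strength λ, same for all CNNs
SGD_LR = 1e-4        # constant SGD learning rate
SGD_MOMENTUM = 0.9   # SGD momentum

# Per-model batch sizes reported in the paper; all other models use 128.
BATCH_SIZES = {
    "vgg16_bn": 32,
    "resnet152": 32,
    "resnet50": 64,
    "vgg16": 64,
}

def batch_size(model_name: str) -> int:
    """Return the reported batch size for a model, defaulting to 128."""
    return BATCH_SIZES.get(model_name.lower(), 128)
```

For example, `batch_size("ResNet18")` falls through to the default of 128, matching the "128 for the remaining models" rule.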