Stochastic Precision Ensemble: Self-Knowledge Distillation for Quantized Deep Neural Networks

Authors: Yoonho Boo, Sungho Shin, Jungwook Choi, Wonyong Sung (pp. 6794-6802)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the superior performance and efficiency of our SPEQ on various applications, including CIFAR10/CIFAR100/ImageNet image classification and also transfer learning scenarios such as BERT-based question answering and flower classification.
Researcher Affiliation | Academia | 1 Seoul National University, Seoul, Korea; 2 Hanyang University, Seoul, Korea
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. Figure 1 illustrates the structure of the training scheme, but it is not pseudocode.
Open Source Code | No | The paper does not include an unambiguous statement or link indicating that the authors have released the source code for the methodology described in this paper. The provided links are to third-party resources (TensorFlow's EfficientNet and Google Research's BERT).
Open Datasets | Yes | We demonstrate the superior performance and efficiency of our SPEQ on various applications, including CIFAR10/CIFAR100/ImageNet image classification and also transfer learning scenarios such as BERT-based question answering and flower classification. fine-tuned using the Stanford Question Answering Dataset (SQuAD 1.1) (Rajpurkar et al. 2016). We expand the experiment for transfer learning using Oxford Flowers-102 (Nilsback and Zisserman 2008).
Dataset Splits | Yes | Table 5: Top-1 validation accuracy (%) on the ImageNet dataset. Table 9: Validation accuracy (%) on the Flowers-102 dataset according to the training method and the feature extractor.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments.
Software Dependencies | No | The paper mentions using "TensorFlow" for obtaining a pretrained model, but it does not specify any software names with version numbers for its own implementation or dependencies (e.g., Python version, PyTorch version, specific library versions).
Experiment Setup | Yes | The training procedures in our experiments consist of three steps: train the floating-point DNN (pretrain), train the QDNN to the target precision initialized from the floating-point parameters (retrain (Hwang and Sung 2014; Choi et al. 2018)), and train the QDNN using the SPEQ method initialized with the retrained parameters. The details of the experimental settings for each task are explained in Appendix D. We set the high precision, n_H, to 8 bits. For all the rest of the experiments, we set the quantization probability, u, to 0.5.
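
The following is a minimal sketch, in PyTorch, of what the third step (SPEQ self-distillation) could look like given only the settings quoted above (n_H = 8 bits, quantization probability u = 0.5, teacher and student sharing the same weights). The `fake_quant` straight-through quantizer, the per-layer Bernoulli(u) choice between the target precision and n_H bits on the teacher path, the KL-divergence distillation term, and the equal loss weighting are all illustrative assumptions, not the paper's exact formulation; `forward_with_precision` and `speq_step` are hypothetical helper names.

```python
# Hypothetical sketch of SPEQ-style self-distillation (step 3), under the
# assumptions stated in the lead-in. Not the authors' implementation.
import torch
import torch.nn.functional as F

N_H = 8          # high precision in bits, as stated in the paper
U = 0.5          # quantization probability u, as stated in the paper
TARGET_BITS = 2  # example target precision; the paper evaluates several

def fake_quant(x, bits):
    """Uniform symmetric fake quantization with a straight-through estimator
    (illustrative; the paper's quantizer may differ)."""
    scale = x.abs().max().clamp(min=1e-8) / (2 ** (bits - 1) - 1)
    q = torch.round(x / scale).clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale
    return x + (q - x).detach()  # gradients flow to the shared weights

def forward_with_precision(layers, x, bits_per_layer):
    """Run the shared network, quantizing each layer's weights to the given
    per-layer precision. Activation quantization is omitted for brevity."""
    for i, (layer, bits) in enumerate(zip(layers, bits_per_layer)):
        x = F.linear(x, fake_quant(layer.weight, bits), layer.bias)
        if i < len(layers) - 1:
            x = F.relu(x)
    return x

def speq_step(layers, x, y, optimizer):
    n_layers = len(layers)
    # Student path: every layer at the low target precision.
    student_logits = forward_with_precision(layers, x, [TARGET_BITS] * n_layers)
    # Teacher path (same weights): each layer independently keeps the target
    # precision with probability U, otherwise uses N_H bits (assumed scheme).
    teacher_bits = [TARGET_BITS if torch.rand(1).item() < U else N_H
                    for _ in range(n_layers)]
    with torch.no_grad():
        teacher_logits = forward_with_precision(layers, x, teacher_bits)
    # Task loss plus a self-distillation loss from the stochastic teacher.
    loss = F.cross_entropy(student_logits, y) + F.kl_div(
        F.log_softmax(student_logits, dim=-1),
        F.softmax(teacher_logits, dim=-1),
        reduction="batchmean",
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Toy usage (hypothetical): two fully connected layers on random data.
# layers = [torch.nn.Linear(784, 256), torch.nn.Linear(256, 10)]
# opt = torch.optim.SGD([p for l in layers for p in l.parameters()], lr=0.01)
# speq_step(layers, torch.randn(32, 784), torch.randint(0, 10, (32,)), opt)
```

Because the teacher and student reuse the same parameters and differ only in the stochastically drawn precisions, no separate teacher network has to be trained or stored, which is the self-distillation aspect reflected in the paper's title.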