Accurate Post Training Quantization With Small Calibration Sets
Authors: Itay Hubara, Yury Nahshan, Yair Hanani, Ron Banner, Daniel Soudry
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically demonstrate that this approach is: (1) much less susceptible to overfitting than the standard fine-tuning approaches, and can be used even on a very small calibration set; and (2) more powerful than previous methods, which only set the activations dynamic ranges. ... Together, these methods yield state-of-the-art results for both vision and text models. |
| Researcher Affiliation | Collaboration | 1Habana Labs An Intel company, Caesarea, Israel 2Department of Electrical Engineering Technion, Haifa, Israel. |
| Pseudocode | No | The paper describes its methods with mathematical formulations and descriptive text but does not include any clearly labeled pseudocode or algorithm blocks; a hedged sketch of the layer-wise AdaQuant objective it describes appears below this table. |
| Open Source Code | Yes | We open-sourced our code https://github.com/papers-submission/CalibTIP. |
| Open Datasets | Yes | In this section, we demonstrate our methods and pipelines on several models and datasets. We first start by analyzing image recognition models such as ResNet18/50, MobileNetV2, which were trained over the ImageNet dataset. Next, we demonstrate our method robustness by applying it on question answering task using the popular BERT model (Devlin et al., 2018), which was fine-tuned on the SQuAD1.1 dataset (Rajpurkar et al., 2016). |
| Dataset Splits | Yes | In all our experiments, we used a small calibration set taken from the training dataset. ... For each method, we measured the top-1 accuracy with respect to the number of samples in the calibration set over five runs and present the mean and standard deviation. ... We use 1,000 samples from the training set as our calibration set. ... Throughout our experiments, we avoided using any augmentation technique and follow the standard (He et al., 2016) validation set preprocessing. |
| Hardware Specification | Yes | As an example, for ResNet50, even the most time-consuming version, seq-AdaQuant, takes less than 5 minutes on one device (GeForce 1080). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies, libraries, or frameworks used in the experiments. |
| Experiment Setup | Yes | In all our experiments, we used a small calibration set taken from the training dataset. Unless stated otherwise, we applied asymmetric per-channel quantization (i.e. GEMMLOWP (Wu et al., 2016)) with quantized offset (i.e., zero point). ... In all experiments, we used 1000 samples from the training set as our calibration set. Our setting considers only a mixture of 8-bit and 4-bit layers. |
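
The "Pseudocode" row notes that the paper presents AdaQuant only as a mathematical formulation: per layer, find a small weight perturbation V minimizing ||WX − Q(W+V)X||² over calibration data. Below is a minimal PyTorch sketch of that objective for a single `torch.nn.Linear` layer. The symmetric per-tensor quantizer, the straight-through estimator, the optimizer choice, and all names (`adaquant_layer`, `iters`, `lr`) are illustrative assumptions, not the authors' implementation; see the CalibTIP repository for the real one.

```python
import torch

def adaquant_layer(layer_fp, x_calib, n_bits=4, iters=1000, lr=1e-3):
    """AdaQuant-style per-layer calibration sketch for a torch.nn.Linear
    layer: learn a small weight perturbation V (and the bias) so that the
    quantized layer reproduces the FP32 layer's outputs on calibration
    data. Symmetric quantization and the hyperparameters are assumptions."""
    w = layer_fp.weight.detach()
    b = layer_fp.bias.detach().clone() if layer_fp.bias is not None else None
    y_fp = layer_fp(x_calib).detach()            # target: full-precision outputs

    qmax = 2 ** (n_bits - 1) - 1                 # e.g. 7 for 4-bit
    scale = w.abs().max() / qmax                 # fixed per-tensor step size

    v = torch.zeros_like(w, requires_grad=True)  # continuous perturbation V
    params = [v] + ([b.requires_grad_()] if b is not None else [])
    opt = torch.optim.Adam(params, lr=lr)

    for _ in range(iters):
        w_soft = w + v
        w_hard = torch.clamp(torch.round(w_soft / scale), -qmax - 1, qmax) * scale
        w_q = w_soft + (w_hard - w_soft).detach()   # straight-through estimator
        loss = (y_fp - torch.nn.functional.linear(x_calib, w_q, b)).pow(2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (w + v).detach(), b
```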
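The "Dataset Splits" row states that calibration data is a small, fixed sample (1,000 images) of the training set, with no augmentation. A minimal sketch of building such a loader for a torchvision-style dataset could look as follows; the uniform sampling without replacement, the fixed seed, and the helper name `make_calibration_loader` are assumptions, since the paper only states the sample count.

```python
import torch
from torch.utils.data import DataLoader, Subset

def make_calibration_loader(train_set, num_samples=1000, batch_size=128, seed=0):
    """Draw a small, fixed calibration subset from the training data.
    Uniform sampling without replacement and the seed are assumptions."""
    g = torch.Generator().manual_seed(seed)
    idx = torch.randperm(len(train_set), generator=g)[:num_samples].tolist()
    return DataLoader(Subset(train_set, idx), batch_size=batch_size, shuffle=False)
```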
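The "Experiment Setup" row specifies asymmetric per-channel quantization with a quantized offset (zero point), i.e. the GEMMLOWP scheme. A NumPy sketch of that scheme is below, under the assumption that the output channel is the first weight axis; the function and variable names are illustrative, not taken from the paper's code.

```python
import numpy as np

def quantize_per_channel_asymmetric(w, n_bits=8):
    """GEMMLOWP-style asymmetric per-channel quantization with a quantized
    zero point. Assumes the output channel is the first axis of `w`."""
    qmin, qmax = 0, 2 ** n_bits - 1
    w2d = w.reshape(w.shape[0], -1)              # one row per output channel
    w_min = w2d.min(axis=1, keepdims=True)
    w_max = w2d.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / (qmax - qmin)
    scale = np.where(scale == 0, 1.0, scale)     # guard constant channels
    zero_point = np.clip(np.round(qmin - w_min / scale), qmin, qmax)
    q = np.clip(np.round(w2d / scale) + zero_point, qmin, qmax)
    w_hat = (q - zero_point) * scale             # dequantize to inspect error
    return q.reshape(w.shape).astype(np.uint8), w_hat.reshape(w.shape)
```

Dequantizing with `(q - zero_point) * scale` makes it easy to measure the per-channel quantization error directly, which is the quantity AdaQuant's layer-wise objective works to reduce.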