Up or Down? Adaptive Rounding for Post-Training Quantization

Authors: Markus Nagel, Rana Ali Amjad, Mart van Baalen, Christos Louizos, Tijmen Blankevoort

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the performance of AdaRound, we conduct experiments on various computer vision tasks and models. In a comprehensive study, we show that AdaRound defines a new state-of-the-art for post-training quantization on several networks and tasks, including ResNet18, ResNet50, MobileNetV2, InceptionV3 and DeepLabV3.
Researcher Affiliation | Industry | Qualcomm AI Research, an initiative of Qualcomm Technologies, Inc. Correspondence to: Markus Nagel <markusn@qti.qualcomm.com>, Rana Ali Amjad <ramjad@qti.qualcomm.com>, Tijmen Blankevoort <tijmen@qti.qualcomm.com>.
Pseudocode | No | The paper describes the steps of the AdaRound algorithm (e.g., 'To quantize the whole model, we optimize (21) layer-by-layer sequentially.'), but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing source code for the described methodology, nor does it provide a direct link to a code repository.
Open Datasets | Yes | We use 1024 unlabeled images from the ImageNet (Russakovsky et al., 2015) training set.
Dataset Splits | Yes | We report the mean and standard deviation of the (top1) accuracy on the ImageNet validation set, calculated using 5 runs with different initial seeds. To optimize AdaRound we use 1024 unlabeled images from the ImageNet (Russakovsky et al., 2015) training set.
Hardware Specification | Yes | It is worthwhile to note that the application of AdaRound to ResNet18 takes only 10 minutes on a single Nvidia GTX 1080 Ti.
Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' as a software dependency, but it does not specify a version number for PyTorch or any other ancillary software components needed for replication.
Experiment Setup | Yes | To optimize AdaRound we use 1024 unlabeled images from the ImageNet (Russakovsky et al., 2015) training set, Adam (Kingma & Ba, 2015) optimizer with default hyper-parameters for 10k iterations and a batch-size of 32, unless otherwise stated. We use symmetric 4-bit weight quantization...
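To make the quantization setup above concrete, here is a minimal NumPy sketch of the core idea: symmetric 4-bit weight quantization where, instead of always rounding to the nearest grid point, each weight gets a per-weight rounding decision h(v) that AdaRound optimizes layer-by-layer. This is an illustrative sketch, not the authors' implementation: the per-tensor max-abs scale is an assumption, and the rectified-sigmoid constants 1.2 and -0.1 correspond to the paper's ζ = 1.1 and γ = -0.1.

```python
import numpy as np

def symmetric_quantize(w, num_bits=4):
    """Baseline: symmetric uniform quantization with round-to-nearest."""
    qmax = 2 ** (num_bits - 1) - 1              # 7 for signed 4-bit
    scale = np.abs(w).max() / qmax              # per-tensor scale (an assumption)
    w_int = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return w_int * scale

def adaround_quantize(w, v, num_bits=4):
    """AdaRound-style quantization: floor the weights, then add a per-weight
    rounding decision h(v) in [0, 1] instead of always rounding to nearest.
    In the paper, v is optimized layer-by-layer to minimize a local loss;
    here v is simply passed in."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(w).max() / qmax
    # Rectified sigmoid h(v) = clip(1.2 * sigmoid(v) - 0.1, 0, 1)
    h = np.clip(1.2 / (1.0 + np.exp(-v)) - 0.1, 0.0, 1.0)
    w_int = np.clip(np.floor(w / scale) + h, -qmax - 1, qmax)
    return w_int * scale
```

Driving v strongly negative recovers round-down (h → 0) and strongly positive recovers round-up (h → 1); the optimization in the paper picks between these per weight so that layer outputs, not individual weights, are best preserved.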