Up or Down? Adaptive Rounding for Post-Training Quantization
Authors: Markus Nagel, Rana Ali Amjad, Mart van Baalen, Christos Louizos, Tijmen Blankevoort
ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate the performance of AdaRound, we conduct experiments on various computer vision tasks and models. In a comprehensive study, we show that AdaRound defines a new state-of-the-art for post-training quantization on several networks and tasks, including ResNet18, ResNet50, MobileNetV2, InceptionV3 and DeepLabV3. |
| Researcher Affiliation | Industry | 1Qualcomm AI Research, an initiative of Qualcomm Technologies, Inc.. Correspondence to: Markus Nagel <markusn@qti.qualcomm.com>, Rana Ali Amjad <ramjad@qti.qualcomm.com>, Tijmen Blankevoort <tijmen@qti.qualcomm.com>. |
| Pseudocode | No | The paper describes the steps of the AdaRound algorithm (e.g., 'To quantize the whole model, we optimize (21) layer-by-layer sequentially.'), but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block. |
| Open Source Code | No | The paper does not include an unambiguous statement that the authors are releasing source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | We use 1024 unlabeled images from the ImageNet (Russakovsky et al., 2015) training set |
| Dataset Splits | Yes | We report the mean and standard deviation of the (top1) accuracy on the ImageNet validation set, calculated using 5 runs with different initial seeds. To optimize AdaRound we use 1024 unlabeled images from the ImageNet (Russakovsky et al., 2015) training set |
| Hardware Specification | Yes | It is worthwhile to note that the application of AdaRound to ResNet18 takes only 10 minutes on a single Nvidia GTX 1080 Ti. |
| Software Dependencies | No | The paper mentions 'PyTorch (Paszke et al., 2019)' as a software dependency, but it does not specify a version number for PyTorch or any other ancillary software components needed for replication. |
| Experiment Setup | Yes | To optimize AdaRound we use 1024 unlabeled images from the ImageNet (Russakovsky et al., 2015) training set, Adam (Kingma & Ba, 2015) optimizer with default hyper-parameters for 10k iterations and a batch-size of 32, unless otherwise stated. We use symmetric 4-bit weight quantization... (an illustrative sketch of this setup appears below the table) |
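
Since the paper does not release code, the following is a minimal PyTorch sketch of the setup the excerpts above describe: symmetric 4-bit weight quantization with a learnable up/down rounding variable, optimized per layer with Adam against the full-precision layer output. The class and function names (`AdaRoundLayer`, `adaround_loss`), the stretch parameters, and the fixed regularization weight are our own reconstruction from the paper's description, not the authors' implementation; in particular, the paper anneals the regularizer's `beta` during optimization, which is held constant here for brevity.

```python
# Illustrative reconstruction only -- the paper releases no code, so names and
# hyper-parameter details below are hypothetical and simplified.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AdaRoundLayer(nn.Module):
    """Symmetric n-bit weight quantizer with a learnable up/down rounding choice."""

    def __init__(self, weight: torch.Tensor, n_bits: int = 4):
        super().__init__()
        self.q_max = 2 ** (n_bits - 1) - 1           # e.g. +7 for 4 bits
        self.q_min = -(2 ** (n_bits - 1))            # e.g. -8 for 4 bits
        self.register_buffer("weight", weight.detach())
        self.scale = self.weight.abs().max() / self.q_max  # per-tensor symmetric scale
        w_floor = torch.floor(self.weight / self.scale)
        self.register_buffer("w_floor", w_floor)
        # Rectified-sigmoid parameterization h(V) in [0, 1]; V is initialized so
        # that h(V) equals the fractional remainder, i.e. round-to-nearest.
        self.zeta, self.gamma = 1.1, -0.1
        rest = self.weight / self.scale - w_floor
        self.v = nn.Parameter(
            -torch.log((self.zeta - self.gamma) / (rest - self.gamma) - 1)
        )

    def h(self) -> torch.Tensor:
        return torch.clamp(
            torch.sigmoid(self.v) * (self.zeta - self.gamma) + self.gamma, 0, 1
        )

    def quantized_weight(self) -> torch.Tensor:
        w_int = torch.clamp(self.w_floor + self.h(), self.q_min, self.q_max)
        return self.scale * w_int


def adaround_loss(layer, x, fp_out, beta=10.0, lam=0.01):
    """Layer-wise reconstruction error plus a rounding regularizer.

    The paper anneals beta over the optimization; it is held fixed here.
    """
    q_out = F.linear(x, layer.quantized_weight())
    recon = (q_out - fp_out).pow(2).mean()
    h = layer.h()
    f_reg = (1 - (2 * h - 1).abs().pow(beta)).sum()
    return recon + lam * f_reg


# Toy per-layer calibration loop mirroring the reported setup (Adam with default
# hyper-parameters, batch size 32; the paper runs 10k iterations on 1024
# unlabeled ImageNet images, random data stands in here purely as a placeholder).
layer = AdaRoundLayer(torch.randn(64, 128), n_bits=4)
optimizer = torch.optim.Adam([layer.v])
for _ in range(1000):
    x = torch.randn(32, 128)
    fp_out = F.linear(x, layer.weight)   # full-precision output to reconstruct
    loss = adaround_loss(layer, x, fp_out)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

After optimization, `layer.h()` is pushed toward 0 or 1 by the regularizer, so `quantized_weight()` yields the final 4-bit weights. As the quote under 'Pseudocode' notes, the paper applies this objective layer-by-layer sequentially to quantize the whole model.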