FP8 Quantization: The Power of the Exponent

Authors: Andrey Kuzmin, Mart van Baalen, Yuwei Ren, Markus Nagel, Jorn Peters, Tijmen Blankevoort

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We investigate the effect of the quantization formats on neural network quantization on three levels: 1) Analytically for several common data and weight distributions, 2) practically in INT8 and FP8 post-training quantization (PTQ) settings, and 3) in quantization-aware training (QAT) settings with both INT8 and different FP8 formats. We will show there is a strong agreement between our theoretical results and our practical results on real networks.
Researcher Affiliation | Industry | Qualcomm AI Research {akuzmin,mart,ren,markusn,jpeters,tijmen}@qti.qualcomm.com
Pseudocode | No | No pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | Code will be made available at https://github.com/Qualcomm-AI-research/FP8-quantization
Open Datasets | Yes | We experiment on ResNet18 [19], MobileNetV2 [38], and ViT [14] for ImageNet classification [37]; BERT-base [12] for language understanding on the GLUE benchmark [43]; HRNet [39] for semantic segmentation on the Cityscapes dataset [10]; DeepLabV3 [7] for semantic segmentation on the Pascal VOC dataset [16]; and SalsaNext [11] for LIDAR point cloud segmentation on the SemanticKITTI dataset [2].
Dataset Splits | Yes | Following [35] we do not apply batch normalization folding, and re-estimate the batch normalization statistics (running mean and variance) before final validation, as this improved results for every model we considered.
Hardware Specification | Yes | Our code is written in PyTorch and all our experiments are performed using NVIDIA Tesla V100 and A100 GPUs.
Software Dependencies | No | The paper states "Our code is written in PyTorch" but does not specify a version number or other software dependencies with versions.
Experiment Setup | Yes | We train our models for 20 epochs and use Adam for the model parameters and SGD for the quantization parameters. We run experiments with various learning rates for model and quantization parameters, as well as per-tensor and per-channel quantization, and report results for the best learning setup.
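
The Research Type row above contrasts INT8 with several FP8 formats in PTQ and QAT. As a concrete illustration of what a simulated FP8 format with a configurable exponent/mantissa split looks like, the snippet below is a minimal quantize-dequantize sketch in PyTorch. It is not the authors' implementation (their code is at the GitHub link above); the function name, the default 4-bit-exponent/3-bit-mantissa split, and the decision to ignore NaN/Inf encodings are assumptions made here for clarity.

```python
import torch

def fp8_quantize_dequantize(x, n_mantissa=3, n_exponent=4, bias=None):
    """Simulated quantize-dequantize for an FP8-like format (1 sign bit,
    n_exponent exponent bits, n_mantissa mantissa bits).

    Minimal sketch, not the paper's code: values are clamped to the largest
    finite magnitude of the format and rounded to the nearest representable
    float; NaN/Inf encodings are ignored for simplicity.
    """
    if bias is None:
        bias = 2 ** (n_exponent - 1) - 1              # IEEE-style exponent bias
    max_exp = 2 ** n_exponent - 1 - bias              # largest unbiased exponent
    min_exp = 1 - bias                                # smallest normal exponent
    max_val = (2.0 ** max_exp) * (2.0 - 2.0 ** -n_mantissa)

    x = torch.clamp(x, -max_val, max_val)
    # Per-element exponent, floored at the subnormal range.
    exp = torch.clamp(torch.floor(torch.log2(x.abs() + 1e-30)), min=float(min_exp))
    scale = 2.0 ** (exp - n_mantissa)                 # quantization step at this exponent
    return torch.round(x / scale) * scale

# Example: quantize a random tensor to an assumed E4M3-style format.
x = torch.randn(8) * 10
print(fp8_quantize_dequantize(x))
```

Varying `n_exponent` while keeping 8 total bits shifts the trade-off between dynamic range and precision, which is the design axis the paper studies analytically and empirically.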
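
The Dataset Splits row quotes the paper's batch-normalization re-estimation step. A hypothetical helper like the one below shows the general recipe: run a number of training-mode forward passes so the BatchNorm running mean and variance are refreshed, then switch back to eval mode for validation. The loader format, batch count, and function name are assumptions, not details from the paper.

```python
import torch

@torch.no_grad()
def reestimate_bn_statistics(model, data_loader, num_batches=50):
    """Refresh BatchNorm running mean/variance with a few forward passes.

    Sketch only: num_batches and the (inputs, targets) loader format are
    assumptions; the paper does not specify these details.
    """
    model.train()                      # BN layers update running stats in train mode
    for i, (inputs, _) in enumerate(data_loader):
        model(inputs)                  # forward pass only; no gradients, no optimizer step
        if i + 1 >= num_batches:
            break
    model.eval()                       # freeze the refreshed statistics for validation
```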
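
The Experiment Setup row notes that model parameters are trained with Adam while quantization parameters use SGD. One way to wire that up in PyTorch is sketched below; the name-based split (parameters containing "quant") and the learning rates are placeholders, since the paper reports sweeping learning rates and selecting the best setup rather than a single configuration.

```python
import torch

def build_qat_optimizers(model, lr_model=1e-5, lr_quant=1e-3):
    """Two-optimizer QAT setup: Adam for model weights, SGD for quantizer parameters.

    Sketch under assumptions: quantizer parameters are identified by "quant"
    in their name, and both learning rates are placeholders, not values
    reported in the paper.
    """
    quant_params, model_params = [], []
    for name, param in model.named_parameters():
        (quant_params if "quant" in name else model_params).append(param)
    optimizer_model = torch.optim.Adam(model_params, lr=lr_model)
    optimizer_quant = torch.optim.SGD(quant_params, lr=lr_quant)
    return optimizer_model, optimizer_quant
```

In a training loop, both optimizers would be stepped and zeroed each iteration, so the weights and the quantization parameters are updated jointly but with their own optimizer and learning rate.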