MetaMix: Meta-State Precision Searcher for Mixed-Precision Activation Quantization

Authors: Han-Byul Kim, Joo Hyung Lee, Sungjoo Yoo, Hong-Seok Kim

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments with efficient and hard-to-quantize networks, i.e., MobileNet v2 and v3, and ResNet-18 on ImageNet show that our proposed method pushes the boundary of mixed-precision quantization, in terms of accuracy vs. operations, by outperforming both mixed- and single-precision SOTA methods.
Researcher Affiliation | Collaboration | Han-Byul Kim (1,2), Joo Hyung Lee (2), Sungjoo Yoo (1), Hong-Seok Kim (2); 1: Department of Computer Science and Engineering, Seoul National University, Seoul, Korea; 2: Google, Mountain View, California, USA
Pseudocode | Yes | Algorithm 1: Pseudo code of MetaMix (bit selection phase)
Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that the code for the described methodology is publicly available.
Open Datasets | Yes | We evaluate the proposed method on ImageNet-1K (Deng et al. 2009).
Dataset Splits | No | The paper mentions using the ImageNet-1K dataset but does not explicitly provide details about specific training, validation, and test splits (e.g., percentages or sample counts) used for their experiments. While ImageNet has standard splits, the paper does not specify how they were used.
Hardware Specification | No | The paper does not specify the exact hardware used for running the experiments (e.g., specific GPU models, CPU types, or cloud computing instances). It only mentions "GPU hours" generally when discussing training cost.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks) used in their experiments.
Experiment Setup | Yes | Table 1 shows the details of training in our proposed method. As the table shows, in the first epoch, we perform bit-meta training which learns full-precision weights and the step sizes of activations. Specifically, on each branch of bit-width (in Figure 4), the activation is quantized to its associated bit-width while its associated step size is being trained. In the second epoch, we iterate bit-meta and bit-search training. In bit-search training, we fix the full-precision weights and step sizes obtained in the bit-meta training and learn only the architectural parameters for per-layer bit-width probabilities. After the per-layer bit-width is obtained, in the weight training phase, we fine-tune both network weights and the step sizes of weights and activations. (A sketch of this phase schedule follows the table.)
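
The Experiment Setup row describes a three-phase schedule: bit-meta training (full-precision weights plus per-bit-width activation step sizes), bit-search training (only the architectural parameters for per-layer bit-width probabilities), and a final weight-training phase. The sketch below illustrates how such a schedule could be wired up in PyTorch, assuming an LSQ-style activation quantizer and a softmax-weighted relaxation over bit-width branches; all names (MixedPrecisionAct, set_phase, BIT_CHOICES) are illustrative assumptions, not the paper's code, and MetaMix's exact formulation may differ.

```python
# Minimal sketch (not the authors' implementation) of the phase schedule in
# the Experiment Setup row: bit-meta training, bit-search training, weight training.
import torch
import torch.nn as nn
import torch.nn.functional as F

BIT_CHOICES = (2, 3, 4)  # candidate activation bit-widths per layer (assumed)


class MixedPrecisionAct(nn.Module):
    """Activation quantizer with one learnable step size per bit-width branch
    and architectural parameters giving per-layer bit-width probabilities."""

    def __init__(self, bits=BIT_CHOICES):
        super().__init__()
        self.bits = bits
        self.step = nn.Parameter(torch.ones(len(bits)))    # trained in bit-meta / weight phases
        self.alpha = nn.Parameter(torch.zeros(len(bits)))  # trained in bit-search phase

    @staticmethod
    def quantize(x, n_bits, step):
        # Unsigned, LSQ-style fake quantization with a straight-through estimator.
        q_max = 2 ** n_bits - 1
        x_q = torch.clamp(x / step, 0, q_max)
        x_q = x_q + (x_q.round() - x_q).detach()
        return x_q * step

    def forward(self, x):
        # Softmax-weighted sum over bit-width branches (a common differentiable
        # relaxation; the paper's exact mixing scheme may differ).
        probs = F.softmax(self.alpha, dim=0)
        return sum(p * self.quantize(x, b, s)
                   for p, b, s in zip(probs, self.bits, self.step))

    def chosen_bits(self):
        # Per-layer bit-width selected after the bit-search phase.
        return self.bits[int(self.alpha.argmax())]


def set_phase(model, phase):
    """Freeze/unfreeze parameters per phase:
    bit_meta   - train full-precision weights and activation step sizes
    bit_search - fix weights and step sizes, train only alpha
    weight     - fine-tune weights and step sizes with bit-widths fixed
    """
    assert phase in ("bit_meta", "bit_search", "weight")
    for name, p in model.named_parameters():
        if name.endswith(".alpha"):
            p.requires_grad_(phase == "bit_search")
        else:  # network weights, biases, and step sizes
            p.requires_grad_(phase != "bit_search")


if __name__ == "__main__":
    # Toy model: a conv layer followed by a mixed-precision activation quantizer.
    model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), MixedPrecisionAct())
    x = torch.randn(1, 3, 32, 32)

    # Epoch 1: bit-meta only; from epoch 2: alternate bit-meta and bit-search.
    for phase in ("bit_meta", "bit_search", "bit_meta", "bit_search"):
        set_phase(model, phase)
        loss = model(x).pow(2).mean()  # stand-in loss
        loss.backward()

    print("selected activation bit-width:", model[1].chosen_bits())
```

In the bit-search phase only the alpha parameters receive gradients, mirroring the row's description of fixing the weights and step sizes learned during bit-meta training; the weight-training phase would then re-enable weights and step sizes with each layer held at its selected bit-width.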