MetaMix: Meta-State Precision Searcher for Mixed-Precision Activation Quantization
Authors: Han-Byul Kim, Joo Hyung Lee, Sungjoo Yoo, Hong-Seok Kim
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments with efficient and hard-to-quantize networks, i.e., MobileNet-v2 and v3, and ResNet-18 on ImageNet show that our proposed method pushes the boundary of mixed-precision quantization, in terms of accuracy vs. operations, by outperforming both mixed- and single-precision SOTA methods. |
| Researcher Affiliation | Collaboration | Han-Byul Kim (1,2), Joo Hyung Lee (2), Sungjoo Yoo (1), Hong-Seok Kim (2); (1) Department of Computer Science and Engineering, Seoul National University, Seoul, Korea; (2) Google, Mountain View, California, USA |
| Pseudocode | Yes | Algorithm 1: Pseudo code of MetaMix (bit selection phase) |
| Open Source Code | No | The paper does not provide a link to open-source code or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate the proposed method on ImageNet-1K (Deng et al. 2009). |
| Dataset Splits | No | The paper mentions using the ImageNet-1K dataset but does not explicitly provide details about specific training, validation, and test splits (e.g., percentages or sample counts) used for their experiments. While ImageNet has standard splits, the paper does not specify how they were used. |
| Hardware Specification | No | The paper does not specify the exact hardware used for running the experiments (e.g., specific GPU models, CPU types, or cloud computing instances). It only mentions "GPU hours" generally when discussing training cost. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks) used in their experiments. |
| Experiment Setup | Yes | Table 1 shows the details of training in our proposed method. As the table shows, in the first epoch, we perform bit-meta training, which learns full-precision weights and the step sizes of activations. Specifically, on each branch of bit-width (in Figure 4), the activation is quantized to its associated bit-width while its associated step size is being trained. In the second epoch, we alternate bit-meta and bit-search training. In bit-search training, we fix the full-precision weights and step sizes obtained in the bit-meta training and learn only the architectural parameters for per-layer bit-width probabilities. After the per-layer bit-width is obtained, in the weight training phase, we fine-tune both network weights and the step sizes of weights and activations. (See the sketch after this table.) |
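
Since the paper does not release code, the snippet below is only a minimal PyTorch-style sketch of the three-phase schedule quoted in the Experiment Setup row (bit-meta, bit-search, weight training). The class and function names (`MixedPrecisionAct`, `set_phase`), the candidate bit-widths, and the soft mixture over bit-width branches are illustrative assumptions, not the authors' implementation; weight quantization and the exact step-size gradient used in the paper are omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MixedPrecisionAct(nn.Module):
    """Illustrative activation quantizer with one branch per candidate bit-width."""

    def __init__(self, bit_candidates=(2, 3, 4)):
        super().__init__()
        self.bits = bit_candidates
        # One learnable step size per bit-width branch (trained in the bit-meta phase).
        self.step_sizes = nn.Parameter(torch.ones(len(bit_candidates)))
        # Architectural parameters for per-layer bit-width probabilities
        # (trained in the bit-search phase).
        self.alpha = nn.Parameter(torch.zeros(len(bit_candidates)))

    def _quantize(self, x, bit, step):
        # Uniform quantization; the straight-through estimator is applied to the
        # rounding only, so gradients still reach the step size.
        qmax = 2 ** bit - 1
        v = torch.clamp(x / step, 0, qmax)
        v_q = v + (torch.round(v) - v).detach()
        return v_q * step

    def forward(self, x):
        # Soft mixture over bit-width branches, weighted by softmax(alpha).
        probs = F.softmax(self.alpha, dim=0)
        out = torch.zeros_like(x)
        for p, bit, step in zip(probs, self.bits, self.step_sizes):
            out = out + p * self._quantize(x, bit, step)
        return out


def set_phase(model, phase):
    """Freeze/unfreeze parameter groups to mimic the three training phases."""
    assert phase in ("bit_meta", "bit_search", "weight_training")
    for module in model.modules():
        if isinstance(module, MixedPrecisionAct):
            module.step_sizes.requires_grad_(phase in ("bit_meta", "weight_training"))
            module.alpha.requires_grad_(phase == "bit_search")
        elif isinstance(module, (nn.Conv2d, nn.Linear)):
            for p in module.parameters():
                p.requires_grad_(phase in ("bit_meta", "weight_training"))


# Toy usage: a layer followed by a mixed-precision activation quantizer.
layer = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), MixedPrecisionAct())
set_phase(layer, "bit_search")        # epoch 2 alternates bit-meta / bit-search
y = layer(torch.randn(1, 3, 32, 32))  # forward pass mixes the bit-width branches
```

In this sketch, the bit-meta phase would train the convolution weights and the per-branch activation step sizes, the bit-search phase would update only `alpha`, and the final weight-training phase would fine-tune weights and step sizes with the searched per-layer bit-width held fixed.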