Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
I-BERT: Integer-only BERT Quantization
Authors: Sehoon Kim, Amir Gholami, Zhewei Yao, Michael W. Mahoney, Kurt Keutzer
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on GLUE downstream tasks using RoBERTa Base/Large. We show that for both cases, I-BERT achieves similar (and slightly higher) accuracy as compared to the full-precision baseline. Furthermore, our preliminary implementation of I-BERT shows a speedup of 2.4-4.0× for INT8 inference on a T4 GPU system as compared to FP32 inference. |
| Researcher Affiliation | Academia | University of California, Berkeley. |
| Pseudocode | Yes | Algorithm 1 Integer-only Computation of Second-order Polynomial a(x + b)^2 + c; Algorithm 2 Integer-only GELU; Algorithm 3 Integer-only Exponential and Softmax; Algorithm 4 Integer-only Square Root |
| Open Source Code | Yes | The framework has been developed in PyTorch and has been open-sourced (Kim, 2021). |
| Open Datasets | Yes | We evaluate our approach on GLUE downstream tasks using RoBERTa Base/Large. |
| Dataset Splits | Yes | For each of the GLUE downstream tasks, we train both FP32 baseline and integer-only I-BERT models, and evaluate the accuracy on the development set. |
| Hardware Specification | Yes | Furthermore, our preliminary implementation of I-BERT shows a speedup of 2.4-4.0× for INT8 inference on a T4 GPU system as compared to FP32 inference. |
| Software Dependencies | No | The framework has been developed in PyTorch and has been open-sourced (Kim, 2021). Specific version numbers for PyTorch and TensorRT are not provided. |
| Experiment Setup | Yes | See Appendix C.2 and C.3 for training and evaluation details. |