BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

Authors: Yang Sui, Yanyu Li, Anil Kag, Yerlan Idelbayev, Junli Cao, Ju Hu, Dhritiman Sagar, Bo Yuan, Sergey Tulyakov, Jian Ren

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach includes several novel techniques, such as assigning optimal bits to each layer, initializing the quantized model for better performance, and improving the training strategy to dramatically reduce quantization error. Furthermore, we extensively evaluate our quantized model across various benchmark datasets and through human evaluation to demonstrate its superior generation quality. [...] 5 Experiments |
| Researcher Affiliation | Collaboration | ¹Snap Inc. ²Rutgers University |
| Pseudocode | Yes | We provide the detailed algorithm as outlined in Alg. 1. |
| Open Source Code | Yes | Project Page: https://snap-research.github.io/BitsFusion [...] We plan to release our code and trained models to facilitate the research efforts towards extreme low-bits quantization. |
| Open Datasets | Yes | We include results on various benchmark datasets, i.e., TIFA [25], GenEval [13], CLIP score [66], and FID [19] on the MS-COCO 2014 validation set [46]. Additionally, we perform human evaluation on PartiPrompts [86]. (A hedged metric sketch follows the table.) |
| Dataset Splits | Yes | We perform QAT over each candidate on a pre-defined training sub-dataset, and validate the incurred quantization error of each candidate by comparing it against the full-precision model (more details in App. B). (An illustrative error-check sketch follows the table.) |
| Hardware Specification | Yes | For Stage-I, we use 8 NVIDIA A100 GPUs with a total batch size of 256 to train the quantized model for 20K iterations. For Stage-II, we use 32 NVIDIA A100 GPUs with a total batch size of 1024 to train the quantized model for 50K iterations. |
| Software Dependencies | No | The paper mentions using the diffusers library and the AdamW optimizer but does not specify version numbers for these or other software dependencies. |
| Experiment Setup | Yes | We develop our code using the diffusers library and train the models with the AdamW optimizer [33] and a constant learning rate of 1e-05 on an internal dataset. For Stage-I, we use 8 NVIDIA A100 GPUs with a total batch size of 256 to train the quantized model for 20K iterations. For Stage-II, we use 32 NVIDIA A100 GPUs with a total batch size of 1024 to train the quantized model for 50K iterations. During inference, we adopt the PNDM scheduler [49] with 50 sampling steps to generate images for comparison. (An inference sketch follows the table.) |
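
The Open Datasets row names CLIP score and FID on the MS-COCO 2014 validation set as the automatic metrics. Below is a minimal sketch of how those two metrics are commonly computed, assuming the torchmetrics implementations; the paper does not state which tooling it used, so treat this as illustrative rather than the authors' pipeline.

```python
# Hedged sketch: CLIP score and FID via torchmetrics (an assumption, not the
# paper's stated tooling).
import torch
from torchmetrics.image.fid import FrechetInceptionDistance
from torchmetrics.multimodal.clip_score import CLIPScore

fid = FrechetInceptionDistance(feature=2048)  # Inception-v3 pool-3 features
clip = CLIPScore(model_name_or_path="openai/clip-vit-base-patch16")

def update_metrics(real_images: torch.Tensor,
                   fake_images: torch.Tensor,
                   prompts: list[str]) -> None:
    # torchmetrics' FID expects uint8 image tensors of shape (N, 3, H, W).
    fid.update(real_images, real=True)
    fid.update(fake_images, real=False)
    clip.update(fake_images, prompts)

# After looping update_metrics over the MS-COCO 2014 validation prompts:
#   print(fid.compute(), clip.compute())
```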
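The Dataset Splits row describes validating each bit-width candidate by comparing the QAT-trained quantized model against the full-precision model on a pre-defined training sub-dataset. The sketch below shows one plausible error measure, assuming a diffusers-style UNet and MSE between noise predictions; the paper's actual criterion is defined in its App. B, so this is a proxy, not the authors' metric.

```python
# Illustrative proxy for the per-candidate quantization-error check: MSE between
# the full-precision and quantized UNets' noise predictions on a small batch.
import torch

@torch.no_grad()
def quantization_error(unet_fp, unet_q, latents, timesteps, text_emb) -> float:
    """Mean squared error between full-precision and quantized noise predictions."""
    eps_fp = unet_fp(latents, timesteps, encoder_hidden_states=text_emb).sample
    eps_q = unet_q(latents, timesteps, encoder_hidden_states=text_emb).sample
    return torch.mean((eps_fp - eps_q) ** 2).item()
```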
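The Experiment Setup row pins down the sampler: the PNDM scheduler with 50 steps. A minimal inference sketch with the diffusers library follows; the Stable Diffusion v1.5 checkpoint id is an assumption (the card itself does not name the base model), and the paper's released quantized UNet would be swapped into the pipeline once available.

```python
# Minimal sketch matching the quoted inference setup: a Stable Diffusion
# pipeline with the PNDM scheduler and 50 sampling steps.
import torch
from diffusers import StableDiffusionPipeline, PNDMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # checkpoint id is an assumption
    torch_dtype=torch.float16,
).to("cuda")

# The quantized UNet from the paper would replace pipe.unet here.
pipe.scheduler = PNDMScheduler.from_config(pipe.scheduler.config)

image = pipe("a photo of an astronaut riding a horse",
             num_inference_steps=50).images[0]
image.save("sample.png")
```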