Understanding Neural Network Binarization with Forward and Backward Proximal Quantizers

Authors: Yiwei Lu, Yaoliang Yu, Xinlin Li, Vahid Partovi Nia

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "conduct image classification experiments on CNNs and vision transformers, and empirically verify that BNN++ generally achieves competitive results on binarizing these models." |
| Researcher Affiliation | Collaboration | Yiwei Lu, School of Computer Science, University of Waterloo & Vector Institute (yiwei.lu@uwaterloo.ca); Yaoliang Yu, School of Computer Science, University of Waterloo & Vector Institute (yaoliang.yu@uwaterloo.ca); Xinlin Li, Huawei Noah's Ark Lab (xinlin.li1@huawei.com); Vahid Partovi Nia, Huawei Noah's Ark Lab (vahid.partovinia@huawei.com) |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. Figure 1 shows a visualization of the forward and backward passes, but it is not pseudocode. |
| Open Source Code | No | The paper does not include an explicit statement or link providing access to open-source code for the proposed methodology. |
| Open Datasets | Yes | "Datasets: We perform image classification on CIFAR-10/100 datasets [27] and ImageNet-1K dataset [28]." |
| Dataset Splits | No | The paper uses standard datasets (CIFAR-10/100, ImageNet-1K), which implies standard splits, but it does not explicitly state the training/validation/test splits (e.g., percentages or sample counts). |
| Hardware Specification | Yes | "Hardware and package: All experiments were run on a GPU cluster with NVIDIA V100 GPUs." |
| Software Dependencies | No | The paper mentions "The platform we use is PyTorch. Specifically, we apply ViT and DeiT models implemented in PyTorch Image Models (timm)", but it does not specify version numbers for PyTorch or timm. |
| Experiment Setup | Yes | "Hyperparameters: We apply the same training hyperparameters and fine-tune/end-to-end training for 100/300 epochs across all models. For binarization methods: (1) PQ (ProxQuant): similar to Bai et al. [2], we apply the Linear Quantizer (LQ), see (10) in Appendix A, with initial ρ0 = 0.01, linearly increased to ρT = 10; (2) rPC (reverse ProxConnect): we use the same LQ for rPC; (3) ProxConnect++: for PC, we apply the same LQ; for BNN+, we choose µ = 5 (no need to increase µ, as the forward quantizer is sign); for BNN++, we choose µ0 = 5 and linearly increase to µT = 30 to achieve binarization at the final step." |
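
To make the setup in the last row concrete, below is a minimal PyTorch sketch of the two ingredients it describes: a sign forward quantizer paired with a µ-controlled surrogate backward pass, and a linear annealing schedule such as µ0 = 5 → µT = 30. This is a hypothetical illustration under our own assumptions, not the authors' code: `SignWithSoftBackward`, its clipped surrogate gradient (the derivative of `clamp(mu * w, -1, 1)`), and `linear_anneal` are stand-ins we chose, and the actual backward proximal quantizer of BNN++ is defined in the paper's Appendix A.

```python
import torch

class SignWithSoftBackward(torch.autograd.Function):
    """Sketch of a forward/backward quantizer pair: hard sign in the
    forward pass, a mu-controlled surrogate gradient in the backward
    pass (derivative of clamp(mu * w, -1, 1)). Illustrative only."""

    @staticmethod
    def forward(ctx, w, mu):
        ctx.save_for_backward(w)
        ctx.mu = mu
        return torch.sign(w)

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        # Pass gradients only in the soft region |w| <= 1/mu; a larger
        # mu narrows the region, approaching exact binarization.
        mask = (w.abs() <= 1.0 / ctx.mu).to(grad_out.dtype)
        return grad_out * mask * ctx.mu, None  # no gradient w.r.t. mu


def linear_anneal(start, end, step, total_steps):
    """Linear schedule, e.g. mu_0 = 5 annealed to mu_T = 30."""
    t = min(step / max(total_steps, 1), 1.0)
    return start + (end - start) * t


# Usage: weights are binarized on the fly while mu increases linearly.
w = torch.randn(8, requires_grad=True)
for step in range(300):
    mu = linear_anneal(5.0, 30.0, step, 300)
    w_bin = SignWithSoftBackward.apply(w, mu)
    loss = (w_bin.sum() - 1.0) ** 2  # stand-in for a real task loss
    loss.backward()
    with torch.no_grad():
        w -= 0.01 * w.grad
        w.grad = None
```

Annealing µ shrinks the region where gradients flow, so by the final step the surrogate backward pass has effectively collapsed onto the hard sign forward pass, matching the "binarization at the final step" behavior described above.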