Uncertainty-aware Binary Neural Networks

Authors: Junhe Zhao, Linlin Yang, Baochang Zhang, Guodong Guo, David Doermann

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that our method improves multiple BNN methods by maintaining stability of training, and achieves a higher performance over prior arts. 4 Experiments: In this part, we investigate the effectiveness of the proposed method on CIFAR-10/100 [Krizhevsky and others, 2009] and ILSVRC12 ImageNet [Deng et al., 2009] datasets with the mainstream deep CNN architectures, including ResNet [He et al., 2016] and Wide ResNet [Zagoruyko and Komodakis, 2016]. Firstly, we elaborate the experiment setups in Section 4.1, including datasets, models, as well as hyperparameter settings. In Section 4.2, a comprehensive comparison on both the CIFAR and the ImageNet datasets in terms of accuracy is illustrated. Finally, we further analyze the effects of the proposed c-sign method during the training process in Section 4.3.
Researcher Affiliation | Collaboration | Junhe Zhao¹, Linlin Yang², Baochang Zhang¹, Guodong Guo³ and David Doermann⁴. ¹Beihang University, Beijing, China; ²University of Bonn, Germany; ³Institute of Deep Learning, Baidu Research; National Engineering Laboratory for Deep Learning Technology and Application; ⁴University at Buffalo, USA
Pseudocode | Yes | Algorithm 1: Uncertainty-aware Binary Neural Network (the algorithm itself is not reproduced in this report; a generic sign/STE binarization sketch is given after the table for context)
Open Source Code | No | The paper does not contain an explicit statement or a link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | In this part, we investigate the effectiveness of the proposed method on CIFAR-10/100 [Krizhevsky and others, 2009] and ILSVRC12 ImageNet [Deng et al., 2009] datasets with the mainstream deep CNN architectures, including ResNet [He et al., 2016] and Wide ResNet [Zagoruyko and Komodakis, 2016].
Dataset Splits | Yes | The training set and testing set of CIFAR-10/100 are composed of 50,000 pictures and 10,000 pictures, respectively, across the 10/100 classes. Moreover, ILSVRC12 ImageNet is a more challenging and diverse dataset, which contains 1.2 million training images and 50,000 validation images across 1,000 classes. (A data-loading sketch reproducing these splits follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions optimizers like 'Adam optimizer' and 'SGD optimizer' but does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions, or other libraries).
Experiment Setup | Yes | The learning rate is initially set to 0.001, with an Adam optimizer of momentum 0.9. A linear decay strategy is employed for the learning rate, which degrades the learning rate in a linear manner. The learning rate is initially set to 0.1, with an SGD optimizer of momentum 0.9, and we also apply a cosine annealing decay method. As for the hyperparameters introduced in UaBNN, m is set to 2. σ is influenced by the initialization method of the corresponding parameters, and we set it equal to the variance of the initialization parameters. For the hyperparameter that controls the stability of training and adaptively varies during training, we test it over (0.05, 0.5), and set it to 0.1 for a better performance. For the CIFAR-10/100 dataset, we employ Wide ResNet (WRN)... Data augmentation is applied during training. The images are padded with a size of 4 and are randomly cropped into 32×32 windows for CIFAR-10/100. We further evaluate the performance of our method on the ImageNet dataset. Notably, we adopt two data augmentation methods in the training set: 1) cropping the image to the size of 224×224 at random locations, and 2) flipping the image horizontally. In the test set, we simply crop the image to 224×224 from the center. (A training-setup sketch based on these settings follows the table.)
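
Context for the Pseudocode entry above: the paper's uncertainty-aware c-sign procedure (Algorithm 1) is not reproduced in this report, so the sketch below only shows the generic sign binarization with a straight-through estimator (STE) that BNN methods, including UaBNN, build on. This is a minimal PyTorch sketch with illustrative names, not the authors' implementation.

```python
import torch


class BinarizeSTE(torch.autograd.Function):
    """Sign binarization with a straight-through estimator (STE).

    A generic BNN building block shown for illustration only; the paper's
    uncertainty-aware c-sign rule is NOT implemented here.
    """

    @staticmethod
    def forward(ctx, weight):
        ctx.save_for_backward(weight)
        # Note: torch.sign maps exact zeros to 0; real BNN code usually maps them to +1.
        return torch.sign(weight)

    @staticmethod
    def backward(ctx, grad_output):
        (weight,) = ctx.saved_tensors
        # STE: pass the gradient through only where |w| <= 1 (clipped identity).
        return grad_output * (weight.abs() <= 1).to(grad_output.dtype)


def binarize(weight: torch.Tensor) -> torch.Tensor:
    return BinarizeSTE.apply(weight)


if __name__ == "__main__":
    w = torch.randn(4, requires_grad=True)
    binarize(w).sum().backward()
    print(w.grad)  # 1.0 where |w| <= 1, 0.0 elsewhere
```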
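For the Dataset Splits entry, a minimal torchvision sketch of the CIFAR-10 split quoted above (50,000 training / 10,000 test images); the data path, batch size, and transform are placeholders, and CIFAR-100 or ImageNet loading would follow the same pattern with their respective dataset classes.

```python
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader

# CIFAR-10: 50,000 training images and 10,000 test images across 10 classes.
transform = T.ToTensor()  # placeholder transform
train_set = torchvision.datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)  # batch size is a placeholder
test_loader = DataLoader(test_set, batch_size=128, shuffle=False, num_workers=4)

print(len(train_set), len(test_set))  # 50000 10000
```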
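For the Experiment Setup entry, a minimal sketch of the quoted settings, assuming PyTorch: Adam at lr 0.001 with linear decay for CIFAR (the quoted "momentum 0.9" is read as beta1 = 0.9), SGD at lr 0.1 with momentum 0.9 and cosine annealing for ImageNet, and the quoted data augmentation. The model, epoch count, batch size, and the ImageNet resize steps are placeholders not specified in the excerpt.

```python
import torch
import torchvision.transforms as T
from torch.optim import Adam, SGD
from torch.optim.lr_scheduler import LinearLR, CosineAnnealingLR

EPOCHS = 200  # placeholder: not stated in the excerpt
model = torch.nn.Linear(10, 10)  # placeholder for a binarized WRN / ResNet backbone

# CIFAR-10/100: Adam, lr 0.001 ("momentum 0.9" read as beta1 = 0.9), linear decay of the lr.
cifar_optimizer = Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
cifar_scheduler = LinearLR(cifar_optimizer, start_factor=1.0, end_factor=0.0, total_iters=EPOCHS)

# ImageNet: SGD, lr 0.1, momentum 0.9, cosine annealing decay.
imagenet_optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9)
imagenet_scheduler = CosineAnnealingLR(imagenet_optimizer, T_max=EPOCHS)

# Augmentation as quoted: pad 4 + random 32x32 crop for CIFAR training;
# random 224x224 crop + horizontal flip for ImageNet training;
# 224x224 center crop for ImageNet testing. The Resize(256) steps are assumptions.
cifar_train_tf = T.Compose([T.RandomCrop(32, padding=4), T.ToTensor()])
imagenet_train_tf = T.Compose([T.Resize(256), T.RandomCrop(224), T.RandomHorizontalFlip(), T.ToTensor()])
imagenet_test_tf = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor()])
```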