From Hashing to CNNs: Training Binary Weight Networks via Hashing

Authors: Qinghao Hu, Peisong Wang, Jian Cheng

AAAI 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on CIFAR10, CIFAR100 and ImageNet demonstrate that our proposed BWNH outperforms current state-of-art by a large margin."
Researcher Affiliation | Academia | Qinghao Hu, Peisong Wang, Jian Cheng: Institute of Automation, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Center for Excellence in Brain Science and Intelligence Technology, CAS, Beijing, China
Pseudocode | Yes | "Algorithm 1: Training Binary Weight Convolutional Neural Networks via Hashing" (a hedged sketch of the alternating optimization appears below the table)
Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | "To evaluate our proposed method, we conduct extensive experiments on three public benchmark datasets including CIFAR10, CIFAR100, and ImageNet. ... ImageNet dataset (ILSVRC2012) has about 1.2M training images from 1000 classes and 50,000 validation images."
Dataset Splits | Yes | "CIFAR10 dataset consists of 60,000 colour images in 10 classes. Each class contains 6000 images in size 32×32. There are 5000 training images and 1000 testing images per class. ... ImageNet dataset (ILSVRC2012) has about 1.2M training images from 1000 classes and 50,000 validation images." (a loading sketch appears below the table)
Hardware Specification | Yes | "All experiments are conducted on a GPU Server which has 8 Nvidia Titan Xp GPUs."
Software Dependencies | No | The paper mentions using the "Caffe framework" and "CUDA" but does not specify version numbers for these software components.
Experiment Setup | Yes | "We implement our proposed method based on the Caffe framework... We adopt different fine-tuning settings for different network architecture. AlexNet: We fine-tune AlexNet using a SGD solver with momentum=0.9, weight decay=0.0005. The learning rate starts at 0.001, and is divided by 10 after 100k, 150k, and 180k iterations. The network is fine-tuned for 200k iterations with batch-size equals to 256." (the reported hyperparameters are restated in code below the table)
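The Pseudocode row points to Algorithm 1. Below is a minimal NumPy sketch of the underlying idea: per layer, learn binary codes B and scaling factors a by alternating optimization so that a*B preserves the layer's pre-activations (the inner-product-preserving hashing view). This is a reconstruction under one reading of the paper, not the authors' Algorithm 1 or released code; the function name, initialization, and update order are assumptions.

```python
# Hedged sketch of per-layer binarization via alternating optimization.
# Not the authors' code; an illustrative reconstruction only.
import numpy as np

def binarize_layer(X, W, n_iters=5):
    """Approximate X @ W with X @ (B * a).

    X : (m, n) sampled layer inputs (rows are examples)
    W : (n, k) real-valued weights (columns are output channels)
    Returns B in {-1, +1}^(n, k) and per-channel scalings a of shape (k,).
    """
    B = np.where(W >= 0, 1.0, -1.0)   # sign initialization
    a = np.abs(W).mean(axis=0)        # BWN-style scaling initialization
    target = X @ W                    # pre-activations to preserve
    for _ in range(n_iters):
        XB = X @ B
        # Step 1: fix B, solve each scaling a_j in closed form (least squares).
        a = np.einsum("mk,mk->k", XB, target) / (np.einsum("mk,mk->k", XB, XB) + 1e-12)
        # Step 2: fix a, update B one input dimension at a time
        # (greedy discrete coordinate descent).
        for i in range(B.shape[0]):
            # residual with row i's contribution removed, per output channel
            residual = target - X @ (B * a) + np.outer(X[:, i], B[i] * a)
            B[i] = np.where(np.einsum("m,mk->k", X[:, i], residual) * a >= 0, 1.0, -1.0)
    return B, a

# Tiny usage example with random data (shapes only, for illustration).
X = np.random.randn(128, 64)
W = np.random.randn(64, 32)
B, a = binarize_layer(X, W)
print(B.shape, a.shape)  # (64, 32) (32,)
```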
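For the Dataset Splits row, the quoted CIFAR10 split (50,000 training and 10,000 test images over 10 classes of 32×32 colour images) can be checked with a few lines of torchvision code. The authors worked in Caffe, so this loader is only an illustrative convenience, not part of their pipeline.

```python
# Illustrative check of the CIFAR10 split quoted above:
# 10 classes, 6000 images per class, 5000 train / 1000 test per class.
# torchvision is an assumption for convenience; the paper used Caffe.
from torchvision import datasets

train_set = datasets.CIFAR10(root="./data", train=True, download=True)
test_set = datasets.CIFAR10(root="./data", train=False, download=True)
print(len(train_set), len(test_set))  # expected: 50000 10000
```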
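The Experiment Setup row reports the AlexNet fine-tuning hyperparameters. As a hedged restatement only (the authors used a Caffe SGD solver, and their solver configuration is not reproduced here), the same settings expressed in PyTorch look roughly like this; the placeholder model and variable names are assumptions.

```python
# Hedged translation of the reported AlexNet fine-tuning hyperparameters
# into PyTorch; not the authors' Caffe configuration.
import torch

model = torch.nn.Linear(10, 10)  # placeholder for the binarized AlexNet
optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.001,            # initial learning rate
    momentum=0.9,
    weight_decay=0.0005,
)
# Divide the learning rate by 10 after 100k, 150k, and 180k iterations;
# train for 200k iterations with batch size 256, stepping the scheduler
# once per training iteration.
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100_000, 150_000, 180_000], gamma=0.1
)
```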