Searching for Low-Bit Weights in Quantized Neural Networks
Authors: Zhaohui Yang, Yunhe Wang, Kai Han, Chunjing Xu, Chao Xu, Dacheng Tao, Chang Xu
NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on benchmarks demonstrate that the proposed method is able to produce quantized neural networks with higher performance over the state-of-the-art methods on both image classification and super-resolution tasks. The PyTorch code will be made available at https://github.com/huawei-noah/Binary-Neural-Networks/tree/main/SLB and the MindSpore code will be made available at https://www.mindspore.cn/resources/hub. |
| Researcher Affiliation | Collaboration | 1 Key Lab of Machine Perception (MOE), Dept. of Machine Intelligence, Peking University; 2 Noah's Ark Lab, Huawei Technologies; 3 School of Computer Science, Faculty of Engineering, University of Sydney. |
| Pseudocode | Yes | Algorithm 1: Training algorithm of SLB (a hedged code sketch of this procedure follows the table). |
| Open Source Code | No | The PyTorch code will be made available at https://github.com/huawei-noah/Binary-Neural-Networks/tree/main/SLB and the MindSpore code will be made available at https://www.mindspore.cn/resources/hub. |
| Open Datasets | Yes | Following common practice in most works, we use the CIFAR-10 [37] and large-scale ILSVRC2012 [9] recognition datasets to demonstrate the effectiveness of our method. We use the 291 images as in [55] for training and test on the Set5 dataset [3]. |
| Dataset Splits | No | The paper mentions using CIFAR-10, ILSVRC2012, and Set5 datasets, which have standard splits, but it does not explicitly state the train/validation/test split percentages or sample counts used for its experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions PyTorch and MindSpore but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | We train the network for 500 epochs in total and decay the learning rate by a factor of 10 at 350, 440, and 475 epochs. The learning rate starts from 1e-3, weight decay is set to 0, and the Adam optimizer is used to update parameters. We set Ts = 0.01 and Te = 10. For the sin and linear schedulers, the accuracies converge rapidly. (See the configuration sketch after the table.) |
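
The "Pseudocode" row refers to Algorithm 1, the training procedure of SLB, which searches over discrete candidate weight values via a temperature-controlled softmax instead of quantizing full-precision weights. The paper's code was not available at review time, so the following is a minimal PyTorch-style sketch under stated assumptions: the layer name `SLBConv2d`, the binary candidate set {-1, +1}, and the logit initialization are illustrative choices, not the authors' implementation; only the general mechanism (soft weights from a softmax over candidates, hardened to the argmax after training) follows the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SLBConv2d(nn.Module):
    """Sketch of a convolution whose weights are *searched* over a small set of
    low-bit candidate values (binary here), as in SLB, rather than quantized
    from full-precision weights."""

    def __init__(self, in_ch, out_ch, kernel_size,
                 candidates=(-1.0, 1.0), stride=1, padding=0):
        super().__init__()
        self.stride, self.padding = stride, padding
        # One continuous logit per (weight position, candidate value) pair.
        self.logits = nn.Parameter(
            torch.randn(out_ch, in_ch, kernel_size, kernel_size, len(candidates)) * 1e-3)
        self.register_buffer("candidates", torch.tensor(candidates))
        # Annealed upward during training (Ts -> Te); see the setup sketch below.
        self.temperature = 0.01

    def forward(self, x):
        # Soft weight: probability-weighted sum of candidate values.
        # Low temperature -> near-uniform mixture; high temperature -> near one-hot.
        probs = F.softmax(self.logits * self.temperature, dim=-1)
        weight = (probs * self.candidates).sum(dim=-1)
        return F.conv2d(x, weight, stride=self.stride, padding=self.padding)

    def discretize(self):
        # After training, keep only the most probable candidate per weight.
        idx = self.logits.argmax(dim=-1)
        return self.candidates[idx]
```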
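The "Experiment Setup" row lists the reported hyperparameters: 500 epochs, Adam with an initial learning rate of 1e-3 and weight decay 0, learning-rate decay by 10x at epochs 350, 440, and 475, and a temperature annealed from Ts = 0.01 to Te = 10. Below is a hedged sketch of how that configuration could be written in PyTorch; the placeholder model, the variable names, and the exact shape of the "sin" scheduler are assumptions, since the paper does not spell them out.

```python
import math
import torch
import torch.nn as nn

# Placeholder network; in practice the convolutions would be SLB layers
# such as the SLBConv2d sketch above.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Flatten(), nn.Linear(16 * 32 * 32, 10))

# Reported optimizer settings: Adam, lr = 1e-3, weight decay = 0.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0)

# Decay the learning rate by a factor of 10 at epochs 350, 440, and 475.
lr_sched = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[350, 440, 475], gamma=0.1)

T_START, T_END, TOTAL_EPOCHS = 0.01, 10.0, 500  # Ts, Te, training length


def temperature(epoch, schedule="linear"):
    """Anneal the softmax temperature from Ts to Te over training.
    The sinusoidal ramp is an assumed form of the paper's "sin" scheduler."""
    frac = epoch / (TOTAL_EPOCHS - 1)
    if schedule == "sin":
        frac = math.sin(frac * math.pi / 2.0)
    return T_START + (T_END - T_START) * frac
```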