BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons

Authors: Yixing Xu, Xinghao Chen, Yunhe Wang

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on the benchmark dataset ImageNet-1k demonstrate the effectiveness of the proposed BiMLP models, which achieve state-of-the-art accuracy compared to prior binary CNNs.
Researcher Affiliation | Industry | Yixing Xu, Xinghao Chen, Yunhe Wang, Huawei Noah's Ark Lab, {yixing.xu, xinghao.chen, yunhe.wang}@huawei.com
Pseudocode | No | The paper describes methods and architectures but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The MindSpore code is available at https://gitee.com/mindspore/models/tree/master/research/cv/BiMLP.
Open Datasets | Yes | The benchmark dataset ImageNet-1k [36] contains over 1.2M training images and 50k validation images from 1,000 different categories.
Dataset Splits | Yes | The benchmark dataset ImageNet-1k [36] contains over 1.2M training images and 50k validation images from 1,000 different categories.
Hardware Specification | Yes | We use NVIDIA V100 GPUs with a total batch size of 1024 to train the model with Mindspore [17].
Software Dependencies | No | The paper mentions using 'Mindspore [17]' for training but does not provide specific version numbers for MindSpore or any other software libraries.
Experiment Setup | Yes | In both steps, the student models are trained for 300 epochs using the AdamW [28] optimizer with momentum of 0.9 and weight decay of 0.05. We start with a learning rate of 1×10⁻³ and a cosine learning rate decay scheduler is used during training. We use NVIDIA V100 GPUs with a total batch size of 1024 to train the model with Mindspore [17]. The commonly used data-augmentation strategies such as CutMix [51], Mixup [52] and RandAugment [5] are used.
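
To make the quoted "Experiment Setup" hyperparameters concrete, here is a minimal PyTorch-style sketch of the optimizer and schedule they describe. The paper's experiments were run in MindSpore; the stand-in model, the elided per-epoch training step, and the second Adam beta are illustrative assumptions, not the authors' code.

import torch
from torch import nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

# Placeholder model standing in for a BiMLP student network (assumption).
model = nn.Linear(3 * 224 * 224, 1000)

optimizer = AdamW(
    model.parameters(),
    lr=1e-3,             # "learning rate of 1e-3"
    betas=(0.9, 0.999),  # beta1 = 0.9 ("momentum of 0.9"); beta2 is an assumed default
    weight_decay=0.05,   # "weight decay of 0.05"
)
scheduler = CosineAnnealingLR(optimizer, T_max=300)  # cosine decay over 300 epochs

for epoch in range(300):
    # One pass over ImageNet-1k with a total batch size of 1024 would go here,
    # applying CutMix / Mixup / RandAugment to each batch (omitted in this sketch).
    scheduler.step()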