BiMLP: Compact Binary Architectures for Vision Multi-Layer Perceptrons
Authors: Yixing Xu, Xinghao Chen, Yunhe Wang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on benchmark dataset ImageNet-1k demonstrate the effectiveness of the proposed BiMLP models, which achieve state-of-the-art accuracy compared to prior binary CNNs. |
| Researcher Affiliation | Industry | Yixing Xu, Xinghao Chen, Yunhe Wang Huawei Noah's Ark Lab {yixing.xu, xinghao.chen, yunhe.wang}@huawei.com |
| Pseudocode | No | The paper describes methods and architectures but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The MindSpore code is available at https://gitee.com/mindspore/models/tree/master/research/cv/BiMLP. |
| Open Datasets | Yes | The benchmark dataset ImageNet-1k [36] contains over 1.2M training images and 50k validation images from 1,000 different categories. |
| Dataset Splits | Yes | The benchmark dataset ImageNet-1k [36] contains over 1.2M training images and 50k validation images from 1,000 different categories. |
| Hardware Specification | Yes | We use NVIDIA V100 GPUs with a total batch size of 1024 to train the model with MindSpore [17]. |
| Software Dependencies | No | The paper mentions using 'MindSpore [17]' for training but does not provide specific version numbers for MindSpore or any other software libraries. |
| Experiment Setup | Yes | In both steps, the student models are trained for 300 epochs using AdamW [28] optimizer with momentum of 0.9 and weight decay of 0.05. We start with the learning rate of 1×10⁻³ and a cosine learning rate decay scheduler is used during training. We use NVIDIA V100 GPUs with a total batch size of 1024 to train the model with MindSpore [17]. The commonly used data-augmentation strategies such as CutMix [51], Mixup [52] and RandAugment [5] are used. |
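
As an illustration of the Experiment Setup row above, the following is a minimal training-loop sketch. It is written in PyTorch purely for readability (the authors train with MindSpore), and the model, data pipeline, and loss are hypothetical placeholders; only the optimizer settings, cosine schedule, epoch count, and batch size follow the quoted setup.

```python
# Illustrative sketch only; the paper trains with MindSpore on NVIDIA V100 GPUs.
import torch

model = torch.nn.Linear(768, 1000)  # hypothetical stand-in for a BiMLP student model

# AdamW: beta1 = 0.9 corresponds to the quoted momentum, weight decay = 0.05,
# initial learning rate 1e-3.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3,
                              betas=(0.9, 0.999), weight_decay=0.05)

# Cosine learning-rate decay over the 300 training epochs.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=300)

for epoch in range(300):
    # Per epoch: iterate over ImageNet-1k with a total batch size of 1024,
    # apply CutMix / Mixup / RandAugment, compute the training loss, and
    # call loss.backward() before the optimizer and scheduler steps below.
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()
```

In the reported experiments the total batch size of 1024 is spread across multiple V100 GPUs, and since the quoted setup trains "student models" in two steps, the placeholder loss above would presumably be a distillation-style objective rather than a plain classification loss.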