BiFSMN: Binary Neural Network for Keyword Spotting

Authors: Haotong Qin, Xudong Ma, Yifu Ding, Xiaoyang Li, Yang Zhang, Yao Tian, Zejun Ma, Jie Luo, Xianglong Liu

IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments show that BiFSMN outperforms existing binarization methods by convincing margins on various datasets and is even comparable with the full-precision counterpart. We highlight that, benefiting from the thinnable architecture and the optimized 1-bit implementation, BiFSMN can achieve an impressive 22.3× speedup and 15.5× storage saving on real-world edge hardware. |
| Researcher Affiliation | Collaboration | Beihang University; Bytedance AI Lab |
| Pseudocode | Yes | Algorithm 1: The training process of our BiFSMN. |
| Open Source Code | No | The paper does not provide a statement or link indicating that the source code for its methodology is publicly available. |
| Open Datasets | Yes | Extensive experiments on Google Speech Commands V1 and V2 datasets [Warden, 2018] to verify the effectiveness of BiFSMN and compare it with state-of-the-art (SOTA) binarization methods and various architectures. |
| Dataset Splits | No | The paper mentions using the Google Speech Commands V1 and V2 datasets but does not explicitly state the training/validation/test splits (e.g., percentages or sample counts), nor that the standard splits from the cited reference were used. (A hedged loading sketch using the standard split lists appears after the table.) |
| Hardware Specification | Yes | To validate the practicability of BiFSMN, we test the actual speed of BiFSMN on a Raspberry Pi 3B+ with a 1.2GHz 64-bit ARMv8 Cortex-A53 CPU. |
| Software Dependencies | No | The paper mentions existing binarization frameworks such as daBNN and Bolt, but does not specify the software dependencies or version numbers used for its own implementation. |
| Experiment Setup | Yes | In the binarized network, both weights and activations are compressed to 1-bit using the sign function in the forward propagation, and the STE [Courbariaux et al., 2015] is applied to clip the gradient in the backward propagation... f(·) = BN(Nonlinear(·)) denotes the composition of batch normalization and nonlinear functions (PReLU in the binarized network [Martinez et al., 2020])... γ is a hyperparameter to control the distillation impact, set to 0.01 by default. The detailed training procedures for BiFSMN are listed in Algorithm 1. (A hedged sketch of these components appears after the table.) |
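
The Experiment Setup row pins down three reusable details: sign-based 1-bit forward propagation with STE gradient clipping, the f(·) = BN(Nonlinear(·)) block with PReLU, and a distillation weight γ = 0.01. The PyTorch sketch below illustrates those pieces under stated assumptions: `BinarizeSTE`, `BinaryLinear`, and `total_loss` are hypothetical names, and the distillation term is a generic logit-matching placeholder rather than the paper's specific distillation objective.

```python
# Minimal sketch of the binarization scheme quoted above, NOT the paper's
# (unreleased) implementation. Names are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F


class BinarizeSTE(torch.autograd.Function):
    """sign() in the forward pass; straight-through estimator backward.

    The STE passes gradients through unchanged only where |x| <= 1,
    i.e., it clips the gradient outside [-1, 1] [Courbariaux et al., 2015].
    """

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        # Strict {-1, +1} binarization (torch.sign would map 0 to 0).
        return torch.where(x >= 0, torch.ones_like(x), -torch.ones_like(x))

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        return grad_output * (x.abs() <= 1).float()


class BinaryLinear(nn.Module):
    """Linear layer with 1-bit weights and activations, followed by
    f(.) = BN(PReLU(.)) as described in the Experiment Setup row."""

    def __init__(self, in_features, out_features):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.01)
        self.act = nn.PReLU(out_features)
        self.bn = nn.BatchNorm1d(out_features)

    def forward(self, x):
        xb = BinarizeSTE.apply(x)             # binarize activations
        wb = BinarizeSTE.apply(self.weight)   # binarize weights
        return self.bn(self.act(F.linear(xb, wb)))


def total_loss(student_logits, teacher_logits, targets, gamma=0.01):
    """Task loss plus a distillation term scaled by gamma (0.01 by default,
    per the paper). The MSE logit-matching term is a placeholder for the
    paper's actual distillation objective."""
    ce = F.cross_entropy(student_logits, targets)
    kd = F.mse_loss(student_logits, teacher_logits.detach())
    return ce + gamma * kd
```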
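
For reproduction, the Google Speech Commands archives [Warden, 2018] ship with standard `validation_list.txt`/`testing_list.txt` split files, which torchaudio exposes through the `subset` argument. The paper does not say which tooling it used, so the loader below is only one plausible way to recover the standard splits flagged as unstated in the Dataset Splits row; the download directory is a placeholder.

```python
# Hedged sketch: loading Google Speech Commands V1/V2 with the standard
# split lists via torchaudio. The paper's own data pipeline is unspecified.

import torchaudio

ROOT = "./data"  # hypothetical download directory

# V2 is "speech_commands_v0.02"; use "speech_commands_v0.01" for V1.
train_set = torchaudio.datasets.SPEECHCOMMANDS(
    ROOT, url="speech_commands_v0.02", download=True, subset="training"
)
val_set = torchaudio.datasets.SPEECHCOMMANDS(
    ROOT, url="speech_commands_v0.02", download=True, subset="validation"
)
test_set = torchaudio.datasets.SPEECHCOMMANDS(
    ROOT, url="speech_commands_v0.02", download=True, subset="testing"
)

# Each item is (waveform, sample_rate, label, speaker_id, utterance_number).
waveform, sample_rate, label, *_ = train_set[0]
print(len(train_set), len(val_set), len(test_set), label)
```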