StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller

Authors: Hong-Sheng Zheng, Yu-Yuan Liu, Chen-Fong Hsu, Tsung Tai Yeh

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In 10 TinyML models, StreamNet-2D achieves a geometric mean of 7.3x speedup and saves 81% of MACs over the state-of-the-art patch-based inference.
Researcher Affiliation | Academia | Hong-Sheng Zheng, Chen-Fong Hsu, Yu-Yuan Liu, Tsung Tai Yeh, Department of Computer Science, National Yang Ming Chiao Tung University, Hsinchu, Taiwan. {hszheng.cs08, fonghsu.cs08, yyliu.cs11, ttyeh14}@nycu.edu.tw
Pseudocode | Yes | Algorithm 1: The StreamNet Parameter Selection Algorithm
Open Source Code | No | The paper describes its implementation and modifications to MCUNetV2 but provides no explicit statement or link indicating that StreamNet's code is open-sourced.
Open Datasets | No | We evaluate 10 TinyML models from the MCUNetV2 model zoo (5), such as mbv2-w0.35 (MB2), proxyless-w0.3 (PL), mcunet-vww0 (MV0), mcunet-vww1 (MV1), mcunet-vww2 (MV2), mcunet-in0 (MI0), mcunet-in1 (MI1), mcunet-in2 (MI2), mcunet-in3 (MI3), mcunet-in4 (MI4). All models use int8 quantized mode.
Dataset Splits | No | The paper evaluates pre-trained TinyML models from a model zoo and does not specify the training, validation, or test splits used to train those models or to run its own evaluation.
Hardware Specification | Yes | The MCU used in our evaluation is STM32F767ZI (8), which includes an ARM Cortex-M7 CPU at 216 MHz, a 512 KB SRAM, and a 2 MB Flash.
Software Dependencies | No | We deploy TinyML models through ARM Mbed CLI (7) on the ARM Cortex-M7 CPU. ... StreamNet translates tensors and operators of the patch-based inference into C source code and links kernel libraries such as ARM CMSIS-NN (6), TinyEngine (1; 2), and our bypass padding kernels.
Experiment Setup | No | The paper describes the system implementation and benchmark models but, because it evaluates pre-trained models, does not report training hyperparameters such as learning rate, batch size, or optimizer settings.
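The headline result aggregates per-model speedups with a geometric mean. As a minimal sketch of how such a figure is computed (the ten speedup values below are hypothetical placeholders, not the paper's measurements):

```python
import math

def geomean(xs):
    # exp(mean(log(x))) avoids overflow/underflow from multiplying many ratios
    return math.exp(sum(math.log(x) for x in xs) / len(xs))

# Hypothetical per-model speedups over patch-based inference, one per benchmark model
speedups = [5.1, 9.4, 6.8, 7.7, 8.2, 6.1, 7.9, 8.8, 6.5, 7.4]
print(f"geometric mean speedup: {geomean(speedups):.2f}x")
```

The geometric mean is the standard choice for averaging speedup ratios, since it is symmetric in which system is treated as the baseline.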
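All evaluated models run in int8 quantized mode. As a generic illustration of what affine int8 quantization does to a tensor (this is a textbook scale/zero-point scheme, not StreamNet's or MCUNetV2's exact quantization recipe):

```python
def quantize_int8(values):
    # Affine quantization: q = round(x / scale) + zero_point, clamped to [-128, 127]
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 255.0 if hi != lo else 1.0
    zero_point = round(-128 - lo / scale)
    q = [max(-128, min(127, round(x / scale) + zero_point)) for x in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    # Recover approximate float values; error is bounded by roughly one scale step
    return [(qi - zero_point) * scale for qi in q]

vals = [-1.0, -0.5, 0.0, 0.5, 1.0]
q, scale, zp = quantize_int8(vals)
approx = dequantize(q, scale, zp)
```

Storing weights and activations as int8 rather than float32 cuts memory by 4x, which is what makes these models fit in the MCU's 512 KB SRAM budget.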