StreamNet: Memory-Efficient Streaming Tiny Deep Learning Inference on the Microcontroller
Authors: Hong-Sheng Zheng, Yu-Yuan Liu, Chen-Fong Hsu, Tsung Tai Yeh
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Across 10 TinyML models, StreamNet-2D achieves a geometric-mean speedup of 7.3× and saves 81% of the MACs compared with state-of-the-art patch-based inference. |
| Researcher Affiliation | Academia | Hong-Sheng Zheng, Chen-Fong Hsu, Yu-Yuan Liu, Tsung Tai Yeh, Department of Computer Science, National Yang-Ming Chiao Tung University, Hsinchu, Taiwan. {hszheng.cs08, fonghsu.cs08, yyliu.cs11, ttyeh14}@nycu.edu.tw |
| Pseudocode | Yes | Algorithm 1: The StreamNet Parameter Selection Algorithm (a hedged sketch of such a selection loop appears after this table). |
| Open Source Code | No | The paper describes its implementation and its modifications to MCUNetV2, but it gives no explicit statement or link indicating that StreamNet's code is open source. |
| Open Datasets | No | We evaluate 10 TinyML models from the MCUNetV2 model zoo (5), such as mbv2-w0.35 (MB2), proxyless-w0.3 (PL), mcunet-vww0 (MV0), mcunet-vww1 (MV1), mcunet-vww2 (MV2), mcunet-in0 (MI0), mcunet-in1 (MI1), mcunet-in2 (MI2), mcunet-in3 (MI3), mcunet-in4 (MI4). All models use int8 quantized mode (an int8 sanity check appears after this table). |
| Dataset Splits | No | The paper evaluates pre-trained Tiny ML models from a model zoo and does not discuss or specify training, validation, or test dataset splits used for training these models or for its own evaluation. |
| Hardware Specification | Yes | The MCU used in our evaluation is the STM32F767ZI (8), which includes an ARM Cortex-M7 CPU at 216 MHz, 512 KB of SRAM, and 2 MB of Flash (a fit-check sketch against this budget appears after this table). |
| Software Dependencies | No | We deploy TinyML models through the ARM Mbed CLI (7) on the ARM Cortex-M7 CPU. ... StreamNet translates tensors and operators of the patch-based inference into C source code and links kernel libraries such as ARM CMSIS-NN (6), TinyEngine (1; 2), and our bypass padding kernels. |
| Experiment Setup | No | The paper describes the system implementation and the benchmark models but, since it evaluates pre-trained models, does not report training hyperparameters such as learning rate, batch size, or optimizer settings. |
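The Pseudocode row above only names the paper's Algorithm 1. As a rough illustration of what a StreamNet-style parameter search could look like, the Python sketch below enumerates candidate stream configurations, estimates MACs and peak SRAM for each with placeholder cost models, and keeps the cheapest configuration that fits the memory budget. Every name and formula here (`StreamConfig`, `estimate_macs`, `estimate_peak_sram`, the candidate grids) is an assumption for illustration, not the paper's Algorithm 1.

```python
# Hypothetical sketch of a StreamNet-style parameter search. This is NOT the
# paper's Algorithm 1; the cost models below are made-up placeholders.
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class StreamConfig:
    num_patches: int   # how many spatial patches the input is split into
    buffer_rows: int   # rows retained in a stream buffer between patches


def estimate_macs(cfg: StreamConfig) -> int:
    """Placeholder cost model: recomputation grows with the patch count and
    shrinks as more rows are cached in the stream buffer (assumption)."""
    recompute_penalty = max(1, cfg.num_patches - cfg.buffer_rows)
    return 1_000_000 * cfg.num_patches + 500_000 * recompute_penalty


def estimate_peak_sram(cfg: StreamConfig) -> int:
    """Placeholder memory model in bytes: per-patch activations plus the
    stream buffer itself (assumption)."""
    per_patch = 512 * 1024 // cfg.num_patches
    return per_patch + 4 * 1024 * cfg.buffer_rows


def select_config(sram_budget: int) -> StreamConfig:
    """Return the feasible configuration with the fewest estimated MACs."""
    candidates = [StreamConfig(p, b)
                  for p, b in product((2, 3, 4, 5), (0, 2, 4, 8))]
    feasible = [c for c in candidates if estimate_peak_sram(c) <= sram_budget]
    if not feasible:
        raise ValueError("no configuration fits the SRAM budget")
    return min(feasible, key=estimate_macs)


if __name__ == "__main__":
    best = select_config(sram_budget=512 * 1024)  # STM32F767ZI: 512 KB SRAM
    print(best, estimate_macs(best), estimate_peak_sram(best))
```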
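The Open Datasets row notes that all ten model-zoo networks run in int8 quantized mode. A quick way to confirm that a downloaded `.tflite` file really is int8 before flashing it is to inspect its tensor dtypes with the TensorFlow Lite interpreter; the file name below is an assumption, not a path from the paper.

```python
# Sanity-check that a model-zoo .tflite file is int8-quantized before
# deploying it to the MCU. The model path is an illustrative assumption.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="mcunet-vww0.tflite")
interpreter.allocate_tensors()

for detail in interpreter.get_input_details() + interpreter.get_output_details():
    print(detail["name"], detail["dtype"], detail["quantization"])
    assert detail["dtype"] == np.int8, "expected an int8-quantized tensor"
```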
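Given the STM32F767ZI budget quoted in the Hardware Specification row (512 KB SRAM, 2 MB Flash), weights must fit in Flash and the peak activation working set in SRAM. A minimal pre-deployment fit check might look like the sketch below; the function, the model path, and the 300 KB activation figure are illustrative assumptions, not measurements from the paper.

```python
# Illustrative pre-deployment budget check for the STM32F767ZI target.
# The model path and activation size are placeholders, not paper numbers.
import os

SRAM_BYTES = 512 * 1024          # on-chip SRAM of the STM32F767ZI
FLASH_BYTES = 2 * 1024 * 1024    # on-chip Flash of the STM32F767ZI


def fits_on_mcu(tflite_path: str, peak_activation_bytes: int) -> bool:
    """Weights (the .tflite file) live in Flash; activations live in SRAM."""
    weight_bytes = os.path.getsize(tflite_path)
    return weight_bytes <= FLASH_BYTES and peak_activation_bytes <= SRAM_BYTES


print(fits_on_mcu("mcunet-vww0.tflite", peak_activation_bytes=300 * 1024))
```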