Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Applications

Authors: Albert Gural, Boris Murmann

ICML 2019

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We validate the memory-optimal CNN technique with an Arduino implementation of the 10-class MNIST classification task, fitting the network specification, weights, and activations entirely within 2KB SRAM and achieving a state-of-the-art classification accuracy for small-scale embedded systems of 99.15%." (An illustrative sketch of the in-place convolution idea follows the table.) |
| Researcher Affiliation | Academia | Albert Gural (1), Boris Murmann (1); (1) Department of Electrical Engineering, Stanford University, Stanford, USA. |
| Pseudocode | No | The paper describes algorithmic strategies but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | "Code and supplemental material are available here." |
| Open Datasets | Yes | "a test accuracy of 99.15% is achievable for the original MNIST-10 dataset from LeCun et al. (1998)." |
| Dataset Splits | No | "A network architecture search is performed to find the best quantized network by validation performance. ... for all 10,000 test images to ensure a 100% match." (No explicit train/validation split percentages or counts are provided.) |
| Hardware Specification | Yes | "A single Arduino based on the ATmega328P chip is employed for the classification task." |
| Software Dependencies | No | "The network is trained in Keras/TensorFlow with Adam (Abadi et al., 2015; Chollet et al., 2015; Kingma & Ba, 2014)." (No specific version numbers are provided for the software components.) |
| Experiment Setup | Yes | "The network is trained in Keras/TensorFlow with Adam (Abadi et al., 2015; Chollet et al., 2015; Kingma & Ba, 2014) for 50 epochs in floating point and 200 epochs with 4-bit quantized training using the straight-through estimator and ALT training (Courbariaux et al., 2016; Jain et al., 2019)." (A sketch of straight-through-estimator training follows the table.) |
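The core claim above is that direct convolutions can be scheduled so that outputs overwrite input activations as soon as those inputs are no longer needed, keeping the peak activation footprint within a 2KB SRAM budget. The paper's exact scheduling and quantized data layout are not reproduced in this report; the NumPy sketch below is a minimal illustration of the general idea for a single-channel 3x3 "same" convolution, where a three-row ring buffer is the only extra storage. The function and variable names are illustrative, not taken from the authors' code.

```python
import numpy as np

def inplace_conv3x3(act, w):
    """3x3 'same'-padded convolution over act (H, W), computed in place.

    Output row i depends only on input rows i-1..i+1, so once row i is
    produced, input row i is never read again and can be overwritten.
    The only extra storage is a ring buffer of three zero-padded rows,
    which is what lets activations fit a tiny SRAM budget.
    """
    H, W = act.shape

    def padded(r):
        # Copy input row r with one zero on each side (all zeros if out of range).
        row = np.zeros(W + 2, dtype=act.dtype)
        if 0 <= r < H:
            row[1:W + 1] = act[r]
        return row

    ring = [padded(-1), padded(0), padded(1)]      # input rows i-1, i, i+1
    for i in range(H):
        out = np.zeros(W, dtype=act.dtype)
        for ki in range(3):                        # accumulate the 3x3 taps
            for kj in range(3):
                out += w[ki, kj] * ring[ki][kj:kj + W]
        ring = [ring[1], ring[2], padded(i + 2)]   # row i+2 is still pristine
        act[i] = out                               # overwrite input row i
    return act

# Example: a 3x3 box filter applied in place to a small activation map.
x = np.arange(16, dtype=np.float32).reshape(4, 4)
k = np.full((3, 3), 1.0 / 9.0, dtype=np.float32)
print(inplace_conv3x3(x, k))
```

With this schedule, peak activation memory is one input/output buffer plus three padded rows, rather than separate full input and output tensors.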
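The Experiment Setup row mentions 4-bit quantized training with the straight-through estimator (Courbariaux et al., 2016). A common TensorFlow realization of that estimator, shown here as an assumption rather than the authors' exact code, applies fake quantization in the forward pass while letting gradients bypass the rounding via tf.stop_gradient. The helper name fake_quant_4bit and the [-1, 1] range are illustrative.

```python
import tensorflow as tf

def fake_quant_4bit(x, x_min=-1.0, x_max=1.0):
    """Uniform 4-bit fake quantization with a straight-through estimator.

    Forward pass: clip x to [x_min, x_max] and round to one of 2**4
    evenly spaced levels. Backward pass: tf.stop_gradient hides the
    rounding from autodiff, so the gradient is that of the clipped
    identity -- the straight-through estimator.
    """
    levels = 2 ** 4 - 1
    scale = (x_max - x_min) / levels
    x_clip = tf.clip_by_value(x, x_min, x_max)
    q = tf.round((x_clip - x_min) / scale) * scale + x_min
    return x_clip + tf.stop_gradient(q - x_clip)

# Example: quantize a layer's kernel before using it in the forward pass.
w = tf.Variable(tf.random.normal([3, 3, 8, 8], stddev=0.1))
w_q = fake_quant_4bit(w)   # gradients w.r.t. w flow through the rounding
```

In a training loop, w_q would be used in the convolution while the optimizer (Adam, per the paper) updates the underlying float variable w.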