Memory-Optimal Direct Convolutions for Maximizing Classification Accuracy in Embedded Applications
Authors: Albert Gural, Boris Murmann
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the memory-optimal CNN technique with an Arduino implementation of the 10-class MNIST classification task, fitting the network specification, weights, and activations entirely within 2KB SRAM and achieving a state-of-the-art classification accuracy for small-scale embedded systems of 99.15%. (A toy sketch of the in-place convolution idea behind this memory claim appears after the table.) |
| Researcher Affiliation | Academia | Albert Gural¹, Boris Murmann¹. ¹Department of Electrical Engineering, Stanford University, Stanford, USA. |
| Pseudocode | No | The paper describes algorithmic strategies but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and supplemental material are available here. |
| Open Datasets | Yes | a test accuracy of 99.15% is achievable for the original MNIST-10 dataset from LeCun et al. (1998). |
| Dataset Splits | No | A network architecture search is performed to find the best quantized network by validation performance. ... for all 10,000 test images to ensure a 100% match. (No explicit train/validation split percentages or counts are provided). |
| Hardware Specification | Yes | A single Arduino based on the ATmega328P chip is employed for the classification task. |
| Software Dependencies | No | The network is trained in Keras/TensorFlow with Adam (Abadi et al., 2015; Chollet et al., 2015; Kingma & Ba, 2014). (No specific version numbers are provided for the software components.) |
| Experiment Setup | Yes | The network is trained in Keras/TensorFlow with Adam (Abadi et al., 2015; Chollet et al., 2015; Kingma & Ba, 2014) for 50 epochs in floating point and 200 epochs with 4-bit quantized training using the straight-through estimator and ALT training (Courbariaux et al., 2016; Jain et al., 2019). (An illustrative STE training sketch appears after the table.) |
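
A plausible reading of "memory-optimal direct convolutions" is computing outputs in an order that lets them reuse input storage, which is how weights and activations can fit inside the 2KB SRAM quoted above. The sketch below is a toy 1-D analogue of that general idea under my own assumptions, not the authors' 2-D multi-channel schedule; the function name, signature, and buffer layout are illustrative.

```python
import numpy as np

def conv1d_inplace(buf, n_in, kernel, stride=1):
    """Toy in-place 1-D valid convolution: outputs overwrite the front
    of the same buffer that holds the inputs. Safe because iteration i
    reads the window starting at i * stride (>= i) before writing index
    i, so reads never touch a cell an earlier iteration overwrote.
    """
    k = len(kernel)
    n_out = (n_in - k) // stride + 1
    for i in range(n_out):
        start = i * stride
        window = buf[start:start + k].copy()  # read before the write below
        buf[i] = float(np.dot(window, kernel))
    return n_out

# Demo: 8 input samples, difference kernel [1, -1].
buf = np.arange(8, dtype=float)
n = conv1d_inplace(buf, 8, np.array([1.0, -1.0]))
print(buf[:n])  # 7 outputs, each -1.0 (x[i] - x[i+1])
```

In 2-D with multiple channels and overlapping windows the safe overwrite ordering is subtler; working that out is the contribution the paper's title refers to.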
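
The Experiment Setup row quotes Keras/TensorFlow training with Adam plus 4-bit quantized training via the straight-through estimator (STE). The sketch below is a minimal, hypothetical rendering of STE-style fake quantization in TensorFlow; `ste_quantize`, `QuantDense`, the layer sizes, and the weight scaling are my own illustrative choices, and the single short run stands in for the paper's 50 floating-point plus 200 quantized epochs. The ALT training the quote mentions is omitted.

```python
import tensorflow as tf

BITS = 4
LEVELS = 2 ** BITS - 1  # 15 uniform quantization steps in [0, 1]

@tf.custom_gradient
def ste_quantize(x):
    # Forward: snap to the nearest of LEVELS uniform levels in [0, 1].
    y = tf.round(tf.clip_by_value(x, 0.0, 1.0) * LEVELS) / LEVELS
    # Backward (STE): pretend rounding is the identity and pass dy through.
    return y, lambda dy: dy

class QuantDense(tf.keras.layers.Layer):
    """Dense layer whose weights are fake-quantized to 4 bits each step."""
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer="glorot_uniform")
        self.b = self.add_weight(shape=(self.units,), initializer="zeros")

    def call(self, x):
        # Map weights from [-1, 1] into [0, 1], quantize, and map back.
        wq = ste_quantize((tf.clip_by_value(self.w, -1.0, 1.0) + 1.0) / 2.0)
        return tf.matmul(x, wq * 2.0 - 1.0) + self.b

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    QuantDense(64),
    tf.keras.layers.ReLU(),
    QuantDense(10),
])
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
model.fit(x_train / 255.0, y_train, epochs=1)  # paper: 50 fp + 200 quantized epochs
```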