Faster Neural Network Training with Approximate Tensor Operations

Authors: Menachem Adelman, Kfir Levy, Ido Hakimi, Mark Silberstein

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply approximate tensor operations to single and multi-node training of MLP and CNN networks on MNIST, CIFAR-10 and ImageNet datasets. We demonstrate up to 66% reduction in the amount of computations and communication, and up to 1.37x faster training time while maintaining negligible or no impact on the final test accuracy.
Researcher Affiliation | Collaboration | Menachem Adelman (Intel & Technion, adelman.menachem@gmail.com); Kfir Y. Levy (Technion, kfirylevy@technion.ac.il); Ido Hakimi (Technion, idohakimi@gmail.com); Mark Silberstein (Technion, mark@ee.technion.ac.il)
Pseudocode | No | The provided text does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | https://github.com/acsl-technion/approx
Open Datasets | Yes | We evaluate our approximate training technique on several network architectures and datasets: MLP and CNN on MNIST [28], Wide ResNet 28-10 [29] on CIFAR-10 [30], and ResNet-50 and ResNet-152 [31] on ImageNet [32].
Dataset Splits | Yes | We evaluate our approximate training technique on several network architectures and datasets: MLP and CNN on MNIST [28], Wide ResNet 28-10 [29] on CIFAR-10 [30], and ResNet-50 and ResNet-152 [31] on ImageNet [32]. We apply approximations only during training, and use exact computations for validation/test evaluation. (A sketch of this train-only approximation appears below the table.)
Hardware Specification | Yes | We train the networks on a single node using NVIDIA V100 GPUs (two GPUs for ResNet-152, one for the rest).
Software Dependencies | No | We implement our techniques in PyTorch [27]... The paper mentions PyTorch but does not specify a version number for it or any other software dependency.
Experiment Setup | No | The paper states that training was done 'without changing training hyper-parameters' but does not explicitly list or detail these parameters (e.g., learning rate, batch size, epochs, optimizer settings) within the provided text.
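
The Research Type and Dataset Splits rows quote the paper's central practice: tensor products are approximated during training, while validation/test evaluation uses exact computations. Below is a minimal, hedged PyTorch sketch of one such approximation, a column-row sampling of a matrix product scored by norm products. The names (sampled_matmul, keep_ratio, ApproxLinear) are illustrative and not taken from the authors' repository (https://github.com/acsl-technion/approx), whose sampling, scaling, and convolution handling may differ.

```python
# Hedged sketch: column-row sampling approximation of a matrix product,
# applied only in training mode. Illustrative only; not the authors' code.
import torch
import torch.nn as nn


def sampled_matmul(a: torch.Tensor, b: torch.Tensor, keep_ratio: float = 0.5) -> torch.Tensor:
    """Approximate a @ b by keeping only the column-row pairs of the inner
    dimension with the largest norm products. `keep_ratio` (hypothetical
    parameter) is the retained fraction of the inner dimension."""
    inner = a.shape[1]
    k = max(1, int(keep_ratio * inner))
    col_norms = a.norm(dim=0)   # norm of each column of a, shape (inner,)
    row_norms = b.norm(dim=1)   # norm of each row of b, shape (inner,)
    idx = torch.topk(col_norms * row_norms, k).indices
    # Partial sum over the selected column-row pairs only.
    return a[:, idx] @ b[idx, :]


class ApproxLinear(nn.Linear):
    """Linear layer (2-D inputs assumed) that uses the sampled product during
    training and the exact product at evaluation time."""

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            out = sampled_matmul(x, self.weight.t())
        else:
            out = x @ self.weight.t()
        if self.bias is not None:
            out = out + self.bias
        return out
```

In this sketch the layer falls back to the exact product whenever the model is in eval() mode, mirroring the paper's statement that approximations are applied only during training and exact computations are used for validation/test evaluation.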