An In-depth Study of Stochastic Backpropagation

Authors: Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on image classification and object detection show that SBP can save up to 40% of GPU memory with less than 1% accuracy degradation.
Researcher Affiliation | Industry | Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe, AWS AI Labs, {junfa, xumingze, hxen, bshuai, ztu, tighej}@amazon.com
Pseudocode | Yes | Algorithm 1: PyTorch-like pseudocode of SBP for an arbitrary operation f. (A hedged sketch of this idea follows the table.)
Open Source Code | Yes | Code is available at: https://github.com/amazon-research/stochastic-backpropagation
Open Datasets | Yes | We evaluate the generalizability of our proposed SBP on two computer vision benchmarks: image classification on ImageNet [18] and object detection on COCO [10].
Dataset Splits | Yes | We use the ViT-Tiny [6] model to evaluate the top-1 accuracy on the ImageNet [18] validation dataset.
Hardware Specification | Yes | All experiments are conducted on machines with 8 Tesla 16GB V100 GPUs.
Software Dependencies | No | The paper mentions "PyTorch-like pseudocode" and uses PyTorch functions in Algorithm 1, but does not specify exact version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | We keep the training hyper-parameters of optimizer, augmentation, regularization, batch size and learning rate the same for a given model, and only adopt different stochastic depth augmentation [9] for different model sizes. The Mixed Precision Training [17] method is enabled for faster training. We train 300 epochs for both these two networks by following the training recipe of ConvNeXt [14]. The experiments are trained for 100 epochs on the ImageNet dataset.
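The paper's Algorithm 1 gives PyTorch-like pseudocode that wraps an arbitrary operation f so that only a random subset of activations is kept for the backward pass. The listing below is a minimal, hedged re-sketch of that idea, not the authors' code: the class name SBPReLU, the keep_ratio argument, the per-sample (row-wise) dropping, and the use of ReLU as a stand-in for f are all illustrative assumptions.

```python
import torch

class SBPReLU(torch.autograd.Function):
    """Illustrative sketch of stochastic backpropagation (SBP) around one op.

    The forward pass is exact, but only a random subset of input rows is
    cached for backward, so the dropped rows receive zero gradient and their
    activations do not have to be stored. ReLU stands in here for the
    arbitrary operation f of Algorithm 1 (an assumption of this sketch).
    """

    @staticmethod
    def forward(ctx, x, keep_ratio=0.5):
        y = torch.relu(x)  # exact forward output for the wrapped op f
        # Randomly select which samples' activations to keep for backward.
        keep = torch.rand(x.shape[0], device=x.device) < keep_ratio
        ctx.save_for_backward(x[keep], keep)  # store only the kept subset
        ctx.input_shape = x.shape
        return y

    @staticmethod
    def backward(ctx, grad_out):
        x_kept, keep = ctx.saved_tensors
        grad_in = grad_out.new_zeros(ctx.input_shape)
        # ReLU gradient, computed only on the kept subset; dropped rows stay zero.
        grad_in[keep] = grad_out[keep] * (x_kept > 0).to(grad_out.dtype)
        return grad_in, None  # no gradient for keep_ratio


# Toy usage: with keep_ratio=0.5, roughly half of the rows get non-zero gradients.
x = torch.randn(8, 16, requires_grad=True)
y = SBPReLU.apply(x, 0.5)
y.sum().backward()
```

The memory saving comes from `save_for_backward` caching only the selected subset of the input; the reported up-to-40% GPU memory reduction in the paper applies this idea at the level of whole network blocks rather than a single activation function.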