An In-depth Study of Stochastic Backpropagation
Authors: Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on image classification and object detection show that SBP can save up to 40% of GPU memory with less than 1% accuracy degradation. |
| Researcher Affiliation | Industry | Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe (AWS AI Labs) {junfa, xumingze, hxen, bshuai, ztu, tighej}@amazon.com |
| Pseudocode | Yes | Algorithm 1: PyTorch-like pseudocode of SBP for an arbitrary operation f (a hedged sketch of the idea appears after this table). |
| Open Source Code | Yes | Code is available at: https://github.com/amazon-research/stochastic-backpropagation |
| Open Datasets | Yes | We evaluate the generalizability of our proposed SBP on two computer vision benchmarks: image classification on ImageNet [18] and object detection on COCO [10]. |
| Dataset Splits | Yes | We use the ViT-Tiny [6] model to evaluate the top-1 accuracy on the ImageNet [18] validation dataset. |
| Hardware Specification | Yes | All experiments are conducted on machines with 8 Tesla 16GB V100. |
| Software Dependencies | No | The paper mentions 'Pytorch-like pseudocode' and uses PyTorch functions in Algorithm 1, but does not specify exact version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We keep the training hyper-parameters of optimizer, augmentation, regularization, batch size and learning rate the same for a given model, and only adopt different stochastic depth augmentation [9] for different model sizes. The Mixed Precision Training [17] method is enabled for faster training. We train both networks for 300 epochs by following the training recipe of ConvNeXt [14]. The experiments are trained for 100 epochs on the ImageNet dataset. (Hedged sketches of the SBP idea and of mixed-precision training follow this table.) |
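
The Pseudocode row refers to the paper's Algorithm 1, PyTorch-like pseudocode of SBP for an arbitrary operation f. The snippet below is a minimal sketch of the stochastic-backpropagation idea, not the authors' algorithm: it assumes `f` is a pointwise operation (e.g., GELU), uses a simple random spatial mask, and the class name `StochasticBackprop` and the `keep_ratio` argument are illustrative. Only the activations at the kept positions are saved for the backward pass, which is where the memory saving comes from; dropped positions simply receive zero gradient.

```python
import torch


class StochasticBackprop(torch.autograd.Function):
    """Sketch: apply a pointwise op `f`, but save activations and propagate
    gradients for only a random `keep_ratio` fraction of spatial positions."""

    @staticmethod
    def forward(ctx, x, f, keep_ratio):
        n, c, h, w = x.shape
        num_keep = max(1, int(keep_ratio * h * w))
        keep_idx = torch.randperm(h * w, device=x.device)[:num_keep]
        # Save only the kept activations for backward -- this is the memory saving.
        ctx.save_for_backward(x.flatten(2)[:, :, keep_idx])
        ctx.f, ctx.keep_idx, ctx.in_shape = f, keep_idx, x.shape
        with torch.no_grad():
            return f(x)

    @staticmethod
    def backward(ctx, grad_out):
        (x_keep,) = ctx.saved_tensors
        n, c, h, w = ctx.in_shape
        grad_out_keep = grad_out.flatten(2)[:, :, ctx.keep_idx]
        # Recompute f only on the kept subset and backprop through it.
        with torch.enable_grad():
            x_keep = x_keep.detach().requires_grad_(True)
            (grad_keep,) = torch.autograd.grad(ctx.f(x_keep), x_keep, grad_out_keep)
        # Scatter the sparse gradient back; dropped positions get zero gradient.
        grad_in = grad_out.new_zeros(n, c, h * w)
        grad_in[:, :, ctx.keep_idx] = grad_keep
        return grad_in.view(n, c, h, w), None, None
```

A quick usage check under the same assumptions:

```python
import torch.nn.functional as F

x = torch.randn(2, 64, 56, 56, requires_grad=True)
y = StochasticBackprop.apply(x, F.gelu, 0.5)   # gradients flow through ~50% of positions
y.sum().backward()
```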
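
The Experiment Setup row also notes that Mixed Precision Training [17] is enabled for faster training. Below is a minimal, generic sketch of how mixed precision is typically enabled in PyTorch with `torch.cuda.amp`; the model, optimizer, and dummy batch are placeholders, not the authors' training recipe.

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(512, 1000).cuda()                  # placeholder model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3) # placeholder optimizer
scaler = torch.cuda.amp.GradScaler()                       # scales the loss to avoid fp16 underflow

images = torch.randn(8, 512, device="cuda")                # dummy batch
labels = torch.randint(0, 1000, (8,), device="cuda")

for step in range(10):
    optimizer.zero_grad(set_to_none=True)
    with torch.cuda.amp.autocast():                        # forward pass runs in mixed precision
        loss = F.cross_entropy(model(images), labels)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
```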