Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
An In-depth Study of Stochastic Backpropagation
Authors: Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe
NeurIPS 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on image classification and object detection show that SBP can save up to 40% of GPU memory with less than 1% accuracy degradation. |
| Researcher Affiliation | Industry | Jun Fang, Mingze Xu, Hao Chen, Bing Shuai, Zhuowen Tu, Joseph Tighe; AWS AI Labs; EMAIL |
| Pseudocode | Yes | Algorithm 1 Pytorch-like pseudocode of SBP for an arbitrary operation f. |
| Open Source Code | Yes | Code is available at: https://github.com/amazon-research/stochastic-backpropagation |
| Open Datasets | Yes | We evaluate the generalizability of our proposed SBP on two computer vision benchmarks: image classification on ImageNet [18] and object detection on COCO [10]. |
| Dataset Splits | Yes | We use ViT-Tiny [6] model to evaluate the top-1 accuracy on ImageNet [18] validation dataset. |
| Hardware Specification | Yes | All experiments are conducted on machines with 8 Tesla 16GB V100. |
| Software Dependencies | No | The paper mentions 'Pytorch-like pseudocode' and uses PyTorch functions in Algorithm 1, but does not specify exact version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We keep the training hyper-parameters of optimizer, augmentation, regularization, batch size and learning rate the same for a given model, and only adopt different stochastic depth augmentation [9] for different model sizes. The Mixed Precision Training [17] method is enabled for faster training. We train 300 epochs for both these two networks by following the training recipe of ConvNeXt [14]. The experiments are trained for 100 epochs on the ImageNet dataset. |
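
The Pseudocode row refers to Algorithm 1, which applies SBP to an arbitrary operation f. For orientation only, the general idea of stochastic backpropagation (run the full forward pass, but keep only a random subset of inputs for the gradient computation) can be sketched in plain Python. This is not the paper's Algorithm 1. The toy operation `f(x) = w * x`, the names `sbp_forward` and `sbp_backward`, the `keep_ratio` parameter, and the choice to leave the gradient unscaled are all assumptions made for illustration.

```python
import random


def sbp_forward(xs, w, keep_ratio, rng):
    """Run the toy op f(x) = w * x on every position, caching only a subset.

    Hypothetical helper: the cached subset stands in for the feature maps
    that would be kept in memory for the backward pass.
    """
    outputs = [w * x for x in xs]  # full forward pass over all positions
    n_keep = max(1, round(keep_ratio * len(xs)))
    kept = sorted(rng.sample(range(len(xs)), n_keep))  # positions that get gradients
    cache = {i: xs[i] for i in kept}  # only these inputs are stored
    return outputs, cache


def sbp_backward(cache, grad_outputs):
    """Gradient of the loss w.r.t. w, using only the cached positions.

    Assumption: the estimate is not rescaled by 1 / keep_ratio.
    """
    return sum(grad_outputs[i] * x for i, x in cache.items())


rng = random.Random(0)
xs = [1.0, 2.0, 3.0, 4.0]
outputs, cache = sbp_forward(xs, w=0.5, keep_ratio=0.5, rng=rng)
grad_w = sbp_backward(cache, grad_outputs=[1.0] * len(xs))

# With keep_ratio=1.0 every position is cached and the full gradient is recovered.
_, full_cache = sbp_forward(xs, w=0.5, keep_ratio=1.0, rng=rng)
full_grad_w = sbp_backward(full_cache, grad_outputs=[1.0] * len(xs))
```

In this sketch the memory saving comes from `cache` holding half of the inputs instead of all of them, while `outputs` is still computed for every position. The released repository linked in the table is the reference for the actual implementation.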