Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks
Authors: Alexander Shekhovtsov, Viktor Yanush, Boris Flach
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models with both proposed methods. |
| Researcher Affiliation | Academia | Alexander Shekhovtsov (Czech Technical University in Prague, shekhovt@cmp.felk.cvut.cz); Viktor Yanush (Lomonosov Moscow State University, yanushviktor@gmail.com); Boris Flach (Czech Technical University in Prague, flachbor@cmp.felk.cvut.cz) |
| Pseudocode | Yes | Algorithm 1: Path Sample-Analytic (PSA); Algorithm 2: Straight-Through (ST). A generic ST sketch follows the table. |
| Open Source Code | Yes | The implementation is available at https://github.com/shekhovt/PSA-Neurips2020. |
| Open Datasets | Yes | To test the proposed methods in a realistic learning setting we use CIFAR-10 dataset and network with 8 convolutional and 1 fully connected layers (Appendix C). |
| Dataset Splits | No | The paper mentions using validation accuracy but does not provide specific percentages or counts for the validation split in the main text or appendices. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions PyTorch but does not specify its version or the versions of any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | All models are trained with SGD with momentum (0.9) and a batch size of 256. We train for 2000 epochs, linearly decaying the learning rate to zero during the last 500 epochs. Each method's learning rate is fine-tuned by an automated search on a log-uniform grid from 10^-5 to 0.1, choosing the highest learning rate that still yields stable training (after 50 epochs). A PyTorch sketch of this schedule follows the table. |
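
For reference, the straight-through (ST) estimator named in the Pseudocode row is a standard technique: hard binary sampling is kept in the forward pass, and the backward pass substitutes a smooth surrogate derivative. The PyTorch sketch below shows this generic pattern; the class name `BinaryST` and the choice of the sigmoid-mean surrogate are illustrative assumptions, not the paper's exact Algorithm 2 (the PSA estimator of Algorithm 1 is paper-specific and not reproduced here).

```python
import torch

class BinaryST(torch.autograd.Function):
    """Generic straight-through (ST) binary activation.

    Forward: sample a {-1, +1} state from the sigmoid of the pre-activation.
    Backward: pass the incoming gradient through the derivative of the
    expected state, ignoring the non-differentiable sampling step
    (the ST approximation). Illustrative sketch only.
    """

    @staticmethod
    def forward(ctx, a):
        p = torch.sigmoid(a)            # firing probability P(x = +1 | a)
        ctx.save_for_backward(p)
        u = torch.rand_like(p)          # uniform noise for sampling
        return torch.where(u < p, torch.ones_like(a), -torch.ones_like(a))

    @staticmethod
    def backward(ctx, grad_output):
        (p,) = ctx.saved_tensors
        # E[x] = 2p - 1, so d E[x] / d a = 2 * p * (1 - p)
        return grad_output * 2.0 * p * (1.0 - p)
```

Used as `x = BinaryST.apply(pre_activation)` inside a network, this lets autograd propagate gradients through stochastic binary units.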
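
The quoted Experiment Setup maps onto a standard PyTorch training configuration. The sketch below is a minimal rendering of that schedule, assuming `LambdaLR` for the piecewise-linear decay; the stand-in model and the starting rate `LR0` are placeholders (the paper tunes the rate per method on a log-uniform grid over [1e-5, 0.1]).

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import LambdaLR

EPOCHS = 2000
DECAY_START = 1500   # linear decay to zero over the last 500 epochs
LR0 = 0.01           # placeholder; the paper keeps the highest rate that
                     # still trains stably after 50 epochs

model = nn.Linear(10, 10)  # stand-in for the 8-conv + 1-FC CIFAR-10 network
opt = torch.optim.SGD(model.parameters(), lr=LR0, momentum=0.9)

def lr_factor(epoch):
    # constant until DECAY_START, then linear ramp down to zero at EPOCHS
    if epoch < DECAY_START:
        return 1.0
    return max(0.0, (EPOCHS - epoch) / (EPOCHS - DECAY_START))

sched = LambdaLR(opt, lr_lambda=lr_factor)

for epoch in range(EPOCHS):
    ...  # one pass over CIFAR-10 with batch size 256 goes here
    sched.step()
```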