Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks
Authors: Alexander Shekhovtsov, Viktor Yanush, Boris Flach
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models with both proposed methods. |
| Researcher Affiliation | Academia | Alexander Shekhovtsov Czech Technical University in Prague EMAIL Viktor Yanush Lomonosov Moscow State University EMAIL Boris Flach Czech Technical University in Prague EMAIL |
| Pseudocode | Yes | Algorithm 1: Path Sample-Analytic (PSA); Algorithm 2: Straight-Through (ST) |
| Open Source Code | Yes | The implementation is available at https://github.com/shekhovt/PSA-Neurips2020. |
| Open Datasets | Yes | To test the proposed methods in a realistic learning setting we use CIFAR-10 dataset and network with 8 convolutional and 1 fully connected layers (Appendix C). |
| Dataset Splits | No | The paper mentions using validation accuracy but does not provide specific percentages or counts for the validation split in the main text or appendices. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions PyTorch but does not specify its version or the versions of any other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | All models are trained with SGD with momentum (0.9) and a batch size of 256. We train for 2000 epochs, linearly decaying the learning rate to zero during the last 500 epochs. Each method's learning rate is fine-tuned by an automated search on a log-uniform grid from 10^-5 to 0.1, choosing the highest learning rate that still yields a stable training (after 50 epochs). |
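
The schedule and learning-rate search quoted in the Experiment Setup row can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The function names, the grid density (`points_per_decade`), and the `is_stable` callback are hypothetical; the quoted text does not say how many grid points were used or how stability was measured.

```python
import math

TOTAL_EPOCHS = 2000  # from the Experiment Setup row
DECAY_EPOCHS = 500   # linear decay to zero over the last 500 epochs


def lr_at_epoch(base_lr, epoch, total=TOTAL_EPOCHS, decay=DECAY_EPOCHS):
    """Hold base_lr constant, then decay linearly over the last `decay` epochs."""
    decay_start = total - decay
    if epoch < decay_start:
        return base_lr
    # Assumption: the rate reaches exactly zero at epoch == total.
    return base_lr * max(0.0, (total - epoch) / decay)


def log_uniform_grid(low=1e-5, high=0.1, points_per_decade=1):
    """Candidates evenly spaced in log10 between low and high, inclusive.

    Assumption: the grid density is not given in the source, so it is
    exposed as a hypothetical parameter here.
    """
    lo, hi = math.log10(low), math.log10(high)
    n = round((hi - lo) * points_per_decade) + 1
    return [10 ** (lo + i * (hi - lo) / (n - 1)) for i in range(n)]


def pick_learning_rate(candidates, is_stable):
    """Return the highest candidate for which is_stable(lr) holds, else None.

    `is_stable` is a hypothetical callback; in the described setup it would
    train for 50 epochs and report whether training stayed stable.
    """
    stable = [lr for lr in candidates if is_stable(lr)]
    return max(stable) if stable else None
```

In a real run, `is_stable` would wrap a short training job, and `lr_at_epoch` would be called once per epoch to set the optimizer's learning rate (SGD with momentum 0.9 and batch size 256 in the quoted setup).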