Analyzing the Generalization Capability of SGLD Using Properties of Gaussian Channels
Authors: Hao Wang, Yizhe Huang, Rui Gao, Flavio Calmon
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our bound through numerical experiments, showing that it can predict the behavior of the true generalization gap. We demonstrate our generalization bound in Theorem 1 through numerical experiments on the MNIST dataset. |
| Researcher Affiliation | Academia | Hao Wang, Harvard University, hao_wang@g.harvard.edu; Yizhe Huang, The University of Texas at Austin, yizhehuang@utexas.edu; Rui Gao, The University of Texas at Austin, rui.gao@mccombs.utexas.edu; Flavio P. Calmon, Harvard University, flavio@seas.harvard.edu |
| Pseudocode | No | The paper provides mathematical equations for the SGLD and DP-SGD algorithms but does not include structured pseudocode or an algorithm block (a hedged sketch of the standard SGLD update is given after this table). |
| Open Source Code | No | The paper mentions that SGLD "has been implemented in open-source libraries (Facebook AI, 2020; Radebaugh and Erlingsson, 2019)" but does not state that the authors are releasing their own source code for the methodology described in this paper. |
| Open Datasets | Yes | We demonstrate our generalization bound in Theorem 1 through numerical experiments on the MNIST dataset. We reproduce our experiments on the CIFAR-10 dataset (Krizhevsky et al., 2009) in the supplementary material. |
| Dataset Splits | No | The paper mentions using a "training dataset" of 5000 samples and evaluating the "generalization gap" (implying held-out data), but it does not provide the train/validation/test splits (e.g., percentages or exact counts) needed for reproduction; only the overall training-set size is stated. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to conduct the experiments. |
| Software Dependencies | No | The paper cites Opacus (Facebook AI, 2020) and TensorFlow Privacy (Radebaugh and Erlingsson, 2019) as open-source implementations but does not provide version numbers for any software dependencies used in its experiments. |
| Experiment Setup | No | The paper describes some general aspects of the experimental setup, such as training 3-layer networks on MNIST with varying widths and depths, adjusting label-corruption levels, and training until the model reaches 1.0 training accuracy or for 400 epochs. However, it lacks concrete details such as learning rates, batch sizes, optimizer choice, and other hyperparameters and configuration settings needed for reproduction. |
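
As noted in the Pseudocode row, the paper states the SGLD and DP-SGD update rules as equations rather than as an algorithm block. The snippet below is a minimal sketch of the standard SGLD step (a gradient step plus Gaussian noise of variance 2·lr/β), not the paper's exact parametrization or noise schedule; the function name, the toy loss, and the default values of `lr` and `beta` are illustrative assumptions.

```python
import torch

def sgld_step(params, loss_fn, batch, lr=1e-2, beta=1e4):
    """One SGLD update: a gradient step plus isotropic Gaussian noise with
    variance 2*lr/beta (beta = inverse temperature). The step size and noise
    scale here are illustrative placeholders, not the paper's settings."""
    loss = loss_fn(params, batch)
    grads = torch.autograd.grad(loss, params)
    with torch.no_grad():
        for p, g in zip(params, grads):
            noise = torch.randn_like(p) * (2.0 * lr / beta) ** 0.5
            p.add_(-lr * g + noise)  # w <- w - lr * grad + sqrt(2*lr/beta) * xi
    return loss.item()

# Toy usage on a least-squares problem (illustrative only).
w = torch.zeros(5, requires_grad=True)
x, y = torch.randn(32, 5), torch.randn(32)
mse = lambda params, data: ((data[0] @ params[0] - data[1]) ** 2).mean()
for _ in range(100):
    sgld_step([w], mse, (x, y), lr=1e-2, beta=1e3)
```

In this form, `beta` acts as an inverse temperature: larger values shrink the injected noise and the update approaches plain gradient descent, while smaller values increase the randomness of the iterates.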