Generalization Bounds for Gradient Methods via Discrete and Continuous Prior
Authors: Xuanyuan Luo, Bei Luo, Jian Li
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we conduct experiments for FGD and FSGD on MNIST [LeCun et al., 1998] and CIFAR10 [Krizhevsky et al., 2009] to investigate the optimization and generalization properties of FGD and FSGD, and the numerical closeness between our theoretical bounds and true test errors. |
| Researcher Affiliation | Academia | Xuanyuan Luo, IIIS, Tsinghua University (xuanyuanluo@google.com); Luo Bei, Renmin University of China (rabbit_lb@ruc.edu.cn); Jian Li, IIIS, Tsinghua University (lijian83@mail.tsinghua.edu) |
| Pseudocode | Yes | Algorithm 1: Floored Gradient Descent (FGD). Input: training dataset S = (z_1, ..., z_n), index set J. Result: parameter W_T ∈ R^d. 1: Initialize W_0 ← w_0; 2: for t = 1 to T do; 3: g_1 ← γ_t ∇f(W_{t−1}, S); 4: g_2 ← γ_t ∇f(W_{t−1}, S_J); 5: W_t ← W_{t−1} − g_2 − ε_t · floor((g_1 − g_2)/ε_t) |
| Open Source Code | No | The paper mentions providing a code in supplementary material in the checklist, but the actual PDF provided does not contain a link or specific instruction to access it. The checklist states: "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes]" However, the main paper PDF does not provide this URL or clear access information within its content. |
| Open Datasets | Yes | In this section, we conduct experiments for FGD and FSGD on MNIST [LeCun et al., 1998] and CIFAR10 [Krizhevsky et al., 2009] to investigate the optimization and generalization properties of FGD and FSGD, and the numerical closeness between our theoretical bounds and true test errors. |
| Dataset Splits | No | The paper does not explicitly provide percentages or counts for training, validation, and test splits. It uses 'standard datasets' where such splits are common knowledge but does not detail them within the paper itself for reproducibility of the specific splits used. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for experiments in the main text. The checklist states: "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] It can be found in our supplemental material." However, this information is not in the provided paper PDF. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers for reproducibility. The checklist states "The code and the data are proprietary", and the text mentions related software (e.g., PyTorch, TensorFlow) but gives no version details for the paper's own implementation. |
| Experiment Setup | Yes | For MNIST, we train a CNN (d ≈ 1.4 × 10^6) by FGD with γ_t = 0.005 · 0.9^⌊t/150⌋, ε_t = 0.005, and momentum = 0.9. The size m = |J| is set to n/2 = 30000. ... For CIFAR10, we train a SimpleNet [Hasanpour et al., 2016] without Batch Norm and Dropout; the number of parameters d is nearly 18 × 10^6. We use FSGD to train our model. The learning rate γ_t is set to 0.001 · 0.9^⌊t/200⌋, the precision ε_t is set to 0.004, and the momentum is set to 0.99. The batch size is 2000, and m = |J| is set to n/5 = 10000. |
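The FGD update quoted in the Pseudocode row (subtract the subset gradient g_2 exactly, then apply the remainder g_1 − g_2 quantized to a grid of resolution ε_t) can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' released implementation: `grad_f`, `fgd_step`, and `gamma` are hypothetical names, and momentum is omitted for brevity.

```python
import numpy as np

def fgd_step(w, grad_f, S, S_J, gamma_t, eps_t):
    """One step of Floored Gradient Descent (Algorithm 1), without momentum.

    g1 is the scaled gradient on the full dataset S, g2 the scaled
    gradient on the index subset S_J; their difference is floored to
    a grid of resolution eps_t before being applied.
    """
    g1 = gamma_t * grad_f(w, S)    # line 3: full-dataset gradient
    g2 = gamma_t * grad_f(w, S_J)  # line 4: subset gradient
    # line 5: apply g2 exactly plus the quantized remainder
    return w - g2 - eps_t * np.floor((g1 - g2) / eps_t)

def gamma(t):
    """Step-decay schedule from the MNIST setup: 0.005 * 0.9^floor(t/150)."""
    return 0.005 * 0.9 ** (t // 150)
```

When (g_1 − g_2)/ε_t happens to be an integer, the floor is exact and the step coincides with a plain gradient step w − g_1; otherwise the two differ by at most ε_t per coordinate, which is what the discretized prior in the paper's bound exploits.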