pbSGD: Powered Stochastic Gradient Descent Methods for Accelerated Non-Convex Optimization

Authors: Beitong Zhou, Jun Liu, Weigao Sun, Ruijuan Chen, Claire Tomlin, Ye Yuan

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The purpose of this section is to demonstrate the efficiency and effectiveness of the proposed pbSGD and pbSGDM algorithms. We conduct experiments with different model architectures on several datasets, comparing against widely used optimization methods, including the non-adaptive method SGDM and three popular adaptive methods: AdaGrad, RMSprop, and Adam.
Researcher Affiliation | Academia | Beitong Zhou¹, Jun Liu², Weigao Sun¹, Ruijuan Chen¹, Claire Tomlin³ and Ye Yuan¹ — ¹School of Artificial Intelligence and Automation, Huazhong University of Science and Technology; ²Department of Applied Mathematics, University of Waterloo; ³Department of Electrical Engineering and Computer Sciences, UC Berkeley
Pseudocode | Yes | Pseudo-code of the proposed pbSGDM is detailed in Algorithm 2 (see the illustrative sketch after this table).
Open Source Code | No | The paper does not explicitly state that source code for the proposed methods (pbSGD/pbSGDM) is released, nor does it link to a code repository for their implementation; the footnotes only link to third-party model implementations.
Open Datasets | Yes | We conduct experiments with different model architectures on several datasets, comparing against widely used optimization methods... (CIFAR-10, CIFAR-100, ImageNet, MNIST).
Dataset Splits | No | The paper specifies a 'mini-batch size of 128 (except 256 in the ImageNet experiment)' and reports train and test accuracy. While hyperparameters are tuned, the paper does not explicitly describe a separate validation split or how one was used during training beyond what is implied by hyperparameter tuning.
Hardware Specification | No | The paper discusses training deep neural networks and running experiments but does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for these experiments.
Software Dependencies | No | The paper references 'pytorch-cifar' (https://github.com/kuangliu/pytorch-cifar) in footnotes for model architectures, implying that PyTorch was used, but it does not specify any software dependencies with version numbers (e.g., 'PyTorch 1.x' or 'CUDA 11.x').
Experiment Setup | Yes | The setup for each experiment is detailed in Table 1. In the first part, we present an empirical study of different deep neural network architectures to see how the proposed methods behave in terms of convergence speed and generalization. ... For all experiments, we used a mini-batch size of 128 (except 256 in the ImageNet experiment). An illustrative training-setup sketch follows this table.
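
For concreteness, below is a minimal PyTorch-style sketch of a pbSGD/pbSGDM-type optimizer, assuming the core update applies the element-wise Powerball transform sign(g)·|g|^γ (with γ ∈ (0, 1]) to the stochastic gradient before a plain SGD step (pbSGD) or a momentum step (pbSGDM). The class name `PoweredSGD`, the default hyperparameter values, the momentum handling, and the CIFAR-10/ResNet-18 usage snippet are illustrative assumptions, not the authors' implementation; refer to the pseudocode in the paper (e.g., Algorithm 2 for pbSGDM) for the exact updates.

```python
import torch
from torch.optim import Optimizer


class PoweredSGD(Optimizer):
    """Illustrative pbSGD/pbSGDM-style optimizer (not the authors' code).

    Assumption: each step applies the element-wise Powerball transform
    sign(g) * |g|**gamma to the stochastic gradient g, then performs a
    plain SGD step (pbSGD) or a heavy-ball momentum step (pbSGDM).
    """

    def __init__(self, params, lr=0.1, gamma=0.8, momentum=0.0):
        if not 0.0 < gamma <= 1.0:
            raise ValueError("gamma is expected to lie in (0, 1]")
        defaults = dict(lr=lr, gamma=gamma, momentum=momentum)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for group in self.param_groups:
            lr, gamma, momentum = group["lr"], group["gamma"], group["momentum"]
            for p in group["params"]:
                if p.grad is None:
                    continue
                # Powerball transform of the raw stochastic gradient.
                d = torch.sign(p.grad) * p.grad.abs().pow(gamma)
                if momentum != 0.0:
                    buf = self.state[p].get("momentum_buffer")
                    if buf is None:
                        buf = d.clone()
                        self.state[p]["momentum_buffer"] = buf
                    else:
                        buf.mul_(momentum).add_(d)
                    d = buf
                p.add_(d, alpha=-lr)
        return loss


# Usage sketch matching the reported setup (mini-batch size 128 on CIFAR-10);
# the dataset/model choices here are placeholders for the architectures in the paper.
if __name__ == "__main__":
    import torchvision
    import torchvision.transforms as T

    train_set = torchvision.datasets.CIFAR10(
        root="./data", train=True, download=True, transform=T.ToTensor()
    )
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

    model = torchvision.models.resnet18(num_classes=10)
    optimizer = PoweredSGD(model.parameters(), lr=0.1, gamma=0.8, momentum=0.9)
    criterion = torch.nn.CrossEntropyLoss()

    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        break  # single illustrative step
```

Setting gamma=1.0 in this sketch reduces the update to ordinary SGD (or SGDM when momentum is nonzero), which is the natural baseline comparison reported in the paper.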