DoWG Unleashed: An Efficient Universal Parameter-Free Gradient Descent Method

Authors: Ahmed Khaled, Konstantin Mishchenko, Chi Jin

NeurIPS 2023

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To complement our theory, we also show empirically that DoWG trains at the edge of stability, and validate its effectiveness on practical machine learning tasks. |
| Researcher Affiliation | Collaboration | Ahmed Khaled (Princeton University), Konstantin Mishchenko (Samsung AI Center), Chi Jin (Princeton University) |
| Pseudocode | Yes | Algorithm 1: DoWG: Distance over Weighted Gradients (see the sketch after the table) |
| Open Source Code | Yes | We implement DoWG on top of the DoG code (DoWG: https://github.com/rka97/dowg, DoG: https://github.com/formll/dog). |
| Open Datasets | Yes | We train the VGG11 (Simonyan and Zisserman, 2015) and ResNet-50 (He et al., 2016) neural network architectures on CIFAR10 (Krizhevsky, 2009) using PyTorch (Paszke et al., 2019). |
| Dataset Splits | No | The paper uses CIFAR10 and mentions 'Test accuracy' and 'Train accuracy/loss', but does not explicitly describe the dataset splits (e.g., percentages or counts for training, validation, and test sets). |
| Hardware Specification | Yes | Experiments run on a single RTX3090 GPU. |
| Software Dependencies | No | The paper mentions PyTorch (Paszke et al., 2019) but does not specify the version of PyTorch or any other software. |
| Experiment Setup | Yes | All methods are used with batch size 256 with no weight decay on a single RTX3090 GPU. We also add a comparison against Adam (Kingma and Ba, 2015) with cosine annealing and the standard step size 10^-3. (A hedged reconstruction appears after the table.) |
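
The Pseudocode row above refers to Algorithm 1 (DoWG: Distance over Weighted Gradients). As a reading aid, here is a minimal NumPy sketch of that update rule: the running maximum distance from the initial point weights the accumulated squared gradient norms, and the step size is the squared distance estimate divided by the square root of that weighted sum. The initial distance estimate `r_eps`, the `grad_fn` interface, and the toy quadratic in the usage line are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def dowg(grad_fn, x0, num_steps=1000, r_eps=1e-4):
    """Sketch of the DoWG update (Distance over Weighted Gradients).

    grad_fn : callable returning the gradient at a point (assumed interface).
    r_eps   : small initial distance estimate; the exact value here is an
              illustrative assumption, not prescribed by the paper.
    """
    x = x0.astype(float).copy()
    r_bar = r_eps  # running max of the distance ||x_t - x_0||
    v = 0.0        # running sum of r_bar_t^2 * ||g_t||^2
    for _ in range(num_steps):
        g = grad_fn(x)
        r_bar = max(r_bar, np.linalg.norm(x - x0))
        v += r_bar ** 2 * np.linalg.norm(g) ** 2
        eta = r_bar ** 2 / np.sqrt(v)  # parameter-free step size
        x = x - eta * g
    return x

# Toy usage on f(x) = 0.5 * ||x||^2, whose gradient is x (illustrative only).
x_final = dowg(lambda x: x, x0=np.ones(10))
```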
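
The Experiment Setup row reports batch size 256, no weight decay, and an Adam baseline with cosine annealing at step size 10^-3 on CIFAR10. The snippet below is a hedged PyTorch reconstruction of that baseline; the epoch count, data transforms, device handling, and the use of the stock torchvision ResNet-50 (rather than a CIFAR-adapted variant) are assumptions not specified in the table.

```python
import torch
import torchvision
import torchvision.transforms as T

device = "cuda" if torch.cuda.is_available() else "cpu"

# CIFAR10 training data; the transform choice here is an assumption.
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=T.ToTensor()
)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True)

# Stock torchvision ResNet-50 with a 10-class head (assumed; the paper may
# adapt the architecture for 32x32 inputs).
model = torchvision.models.resnet50(num_classes=10).to(device)

# Adam baseline: lr 1e-3, no weight decay, cosine annealing (as reported).
num_epochs = 100  # assumption: not specified in the table above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in range(num_epochs):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss_fn(model(images), labels).backward()
        optimizer.step()
    scheduler.step()
```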