QLABGrad: A Hyperparameter-Free and Convergence-Guaranteed Scheme for Deep Learning

Authors: Minghan Fu, Fang-Xiang Wu

AAAI 2024

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiment results on multiple architectures, including MLP, CNN, and ResNet, on MNIST, CIFAR10, and ImageNet datasets, demonstrate that QLABGrad outperforms various competing schemes for deep learning. |
| Researcher Affiliation | Academia | Minghan Fu (1), Fang-Xiang Wu (1,2); (1) Department of Mechanical Engineering, University of Saskatchewan; (2) Department of Computer Science, University of Saskatchewan |
| Pseudocode | Yes | Algorithm 1: QLABGrad |
| Open Source Code | No | The paper does not provide a direct link to a source code repository or an explicit statement about the release of the code for the described methodology. |
| Open Datasets | Yes | To assess the effectiveness of our proposed QLABGrad, we carry out extensive experiments on three different datasets (MNIST (LeCun et al. 1998), CIFAR10 (Krizhevsky, Hinton et al. 2009), and Tiny-ImageNet (Le and Yang 2015)) with various models (multi-layer neural network, CNN, and ResNet-18), comparing to various popular competing schemes, including basic SGD, RMSProp, AdaGrad, and Adam. |
| Dataset Splits | No | The MNIST dataset (LeCun et al. 1998) includes 60,000 training and 10,000 testing images. [...] The CIFAR-10 dataset (Krizhevsky, Hinton et al. 2009) comprises 50,000 training images and 10,000 test images... Only training and testing splits are mentioned; no explicit validation split. (See the data-loading sketch after this table.) |
| Hardware Specification | Yes | All experiments are performed on a single RTX 3090 GPU. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | The MNIST dataset (LeCun et al. 1998) includes 60,000 training and 10,000 testing images. Figures 4 and 5 show the training loss for a multi-layer neural network (MLP) and a convolutional neural network (CNN), respectively, with a mini-batch size of 64. (See the training-setup sketch after this table.) |
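
Because the Software Dependencies row notes that no framework versions are stated, the following is a minimal data-loading sketch, assuming PyTorch and torchvision: it builds the standard MNIST (60,000/10,000) and CIFAR-10 (50,000/10,000) train/test splits quoted in the Dataset Splits row and constructs the competing optimizers named in the Open Datasets row. The data directory, transforms, and learning rates are illustrative assumptions, not the authors' settings.

```python
# Sketch only (assumed PyTorch/torchvision): standard MNIST and CIFAR-10
# train/test splits as reported in the paper, plus the baseline optimizers
# it compares against. Hyperparameters here are placeholder assumptions.
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()

# Standard splits: MNIST 60,000 train / 10,000 test; CIFAR-10 50,000 / 10,000.
mnist_train = datasets.MNIST("data", train=True, download=True, transform=to_tensor)
mnist_test = datasets.MNIST("data", train=False, download=True, transform=to_tensor)
cifar_train = datasets.CIFAR10("data", train=True, download=True, transform=to_tensor)
cifar_test = datasets.CIFAR10("data", train=False, download=True, transform=to_tensor)

# Mini-batch size 64, as reported for the MNIST experiments.
train_loader = DataLoader(mnist_train, batch_size=64, shuffle=True)
test_loader = DataLoader(mnist_test, batch_size=64, shuffle=False)

def baseline_optimizers(model):
    """Competing schemes named in the paper; learning rates are placeholders."""
    return {
        "SGD": torch.optim.SGD(model.parameters(), lr=0.01),
        "RMSProp": torch.optim.RMSprop(model.parameters(), lr=0.001),
        "AdaGrad": torch.optim.Adagrad(model.parameters(), lr=0.01),
        "Adam": torch.optim.Adam(model.parameters(), lr=0.001),
    }
```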
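
The Experiment Setup row gives only the mini-batch size (64) and the figures reporting MLP and CNN training loss on MNIST, so this training-setup sketch fills the gaps with assumptions: the MLP layer sizes and the plain SGD optimizer are placeholders, and QLABGrad itself (the paper's Algorithm 1) is not reproduced here.

```python
# Minimal training-setup sketch for the MNIST MLP experiment (mini-batch size 64).
# Architecture and optimizer below are illustrative assumptions, not the paper's.
import torch
from torch import nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # paper: single RTX 3090 GPU

mlp = nn.Sequential(                      # assumed layer sizes, for illustration only
    nn.Flatten(),
    nn.Linear(28 * 28, 256), nn.ReLU(),
    nn.Linear(256, 10),
).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(mlp.parameters(), lr=0.01)  # placeholder baseline, not QLABGrad

def train_one_epoch(loader):
    """One pass over a loader built with batch_size=64, as reported."""
    mlp.train()
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(mlp(images), labels)
        loss.backward()
        optimizer.step()
```

A run would pair `train_one_epoch` with the `train_loader` from the data-loading sketch above.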