Calibrated Stochastic Gradient Descent for Convolutional Neural Networks

Authors: Li’an Zhuo, Baochang Zhang, Chen Chen, Qixiang Ye, Jianzhuang Liu, David Doermann. Pages 9348–9355.

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experimental results demonstrate that CNNs with our CSGD optimization scheme can improve the state-of-the-art performance for natural image classification, digit recognition, ImageNet object classification, and object detection tasks."
Researcher Affiliation | Collaboration | (1) School of Automation Science and Electrical Engineering, Beihang University, Beijing; (2) University of North Carolina at Charlotte, Charlotte, NC; (3) Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY; (4) Huawei Noah's Ark Lab; (5) University of Chinese Academy of Sciences, China
Pseudocode | Yes | Algorithm 1: The CSGD algorithm
Open Source Code | No | The paper provides a link, "https://github.com/bczhangbczhang/", which is stated to be for a "technical report" related to a proof, not explicitly for the source code of the methodology itself.
Open Datasets | Yes | "For the natural image classification task, we use the CIFAR-10 and CIFAR-100 datasets (Krizhevsky 2009)... The ImageNet (Deng et al. 2009) dataset... PASCAL VOC 2007 dataset."
Dataset Splits | Yes | "The training procedure is terminated at 64k iterations, which is determined based on a 45k/5k train/validation split."
Hardware Specification | Yes | "These models are trained on 4 GPUs (Titan XP) with a mini-batch size of 128."
Software Dependencies | No | The paper mentions that Faster R-CNN is implemented on the "Caffe2 platform" but does not specify a version number for Caffe2 or any other software dependency.
Experiment Setup | Yes | "We use a weight decay of 0.0001 and momentum of 0.9... The learning rate is initialized as 0.1 and decreased to 1/10 of the previous value every 15 epochs... A learning rate of 0.004 for 12.5k mini-batches, and 0.0004 for the next 5k mini-batches, a momentum of 0.9 and a weight decay of 0.0005 are used."
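The two learning-rate schedules quoted above can be sketched as simple step functions. This is a framework-free illustration of the reported hyperparameters only, not the CSGD algorithm itself; the function names and dict layout are my own.

```python
def classification_lr(epoch, base_lr=0.1, drop=0.1, step=15):
    """Step decay quoted for classification: start at 0.1 and
    divide the learning rate by 10 every 15 epochs."""
    return base_lr * drop ** (epoch // step)

def detection_lr(iteration):
    """Piecewise-constant schedule quoted for Faster R-CNN:
    0.004 for the first 12.5k mini-batches, then 0.0004."""
    return 0.004 if iteration < 12500 else 0.0004

# Remaining SGD hyperparameters quoted in the report
# (classification uses weight decay 0.0001, detection 0.0005).
sgd_config = {"momentum": 0.9, "weight_decay": 1e-4}
```

Under this schedule the classification learning rate is 0.1 for epochs 0–14, 0.01 for epochs 15–29, and so on.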