Calibrated Stochastic Gradient Descent for Convolutional Neural Networks
Authors: Li’an Zhuo, Baochang Zhang, Chen Chen, Qixiang Ye, Jianzhuang Liu, David Doermann (pp. 9348-9355)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that CNNs with our CSGD optimization scheme can improve the state-of-the-art performance for natural image classification, digit recognition, ImageNet object classification, and object detection tasks. |
| Researcher Affiliation | Collaboration | (1) School of Automation Science and Electrical Engineering, Beihang University, Beijing; (2) University of North Carolina at Charlotte, Charlotte, NC; (3) Department of Computer Science and Engineering, University at Buffalo, Buffalo, NY; (4) Huawei Noah's Ark Lab; (5) University of Chinese Academy of Sciences, China |
| Pseudocode | Yes | Algorithm 1: The CSGD algorithm |
| Open Source Code | No | The paper provides a link "https://github.com/bczhangbczhang/" which is stated to be for a "technical report" related to a proof, not explicitly for the source code of the methodology itself. |
| Open Datasets | Yes | For the natural image classification task, we use the CIFAR-10 and CIFAR-100 datasets (Krizhevsky 2009)... The ImageNet (Deng et al. 2009) dataset... PASCAL VOC 2007 dataset. |
| Dataset Splits | Yes | The training procedure is terminated at 64k iterations, which is determined based on a 45k/5k train/validation split. |
| Hardware Specification | Yes | These models are trained on 4 GPUs (Titan XP) with a mini-batch size of 128. |
| Software Dependencies | No | The paper mentions that Faster R-CNN is implemented based on the "Caffe2 platform" but does not specify a version number for Caffe2 or any other software dependencies. |
| Experiment Setup | Yes | We use a weight decay of 0.0001 and momentum of 0.9... The learning rate is initialized as 0.1 and decreased to 1/10 of the previous size every 15 epochs... A learning rate of 0.004 for 12.5k mini-batches, and 0.0004 for the next 5k mini-batches, a momentum of 0.9 and a weight decay of 0.0005 are used. |
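The Experiment Setup row above quotes the paper's training hyperparameters (weight decay 0.0001, momentum 0.9, initial learning rate 0.1 divided by 10 every 15 epochs). The minimal PyTorch sketch below shows one way such a schedule could be configured; it uses plain SGD as a stand-in for the paper's CSGD optimizer, and the ResNet-18/CIFAR-10 model choice is an assumption for illustration only.

```python
import torch.nn as nn
import torch.optim as optim
from torchvision.models import resnet18

# Hypothetical model choice; the paper evaluates CNNs on CIFAR and ImageNet.
model = resnet18(num_classes=10)

# Hyperparameters quoted in the table: lr 0.1, momentum 0.9, weight decay 1e-4.
# Plain SGD stands in here for the paper's CSGD optimizer, which is not
# reproduced in this report.
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)

# "decreased to 1/10 of the previous size every 15 epochs"
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)

criterion = nn.CrossEntropyLoss()

def train_one_epoch(loader):
    """Run one training epoch, then step the learning-rate schedule."""
    model.train()
    for images, labels in loader:
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()  # decay the learning rate once per epoch
```

The detection-specific settings quoted in the same row (learning rate 0.004 for 12.5k mini-batches, then 0.0004 for 5k mini-batches, weight decay 0.0005) follow the paper's Faster R-CNN configuration and are not reflected in this sketch.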