Stochastic Batch Augmentation with An Effective Distilled Dynamic Soft Label Regularizer
Authors: Qian Li, Qingyuan Hu, Yong Qi, Saiyu Qi, Jie Ma, Jian Zhang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed SBA is empirically verified on several benchmark image classification datasets including CIFAR-10, CIFAR-100 and ImageNet. Extensive experiments show that our SBA-DSLR outperforms other state-of-the-art data augmentation methods. |
| Researcher Affiliation | Academia | Qing Yu1, Handong Li1, Weifeng Ge1, Min Yang2, Yiheng Zhou1, Huitong Qu1, Guodong Li1, Junyu Han1, Jingdong Chen1, Errui Ding1 1Baidu Inc., China 2South China University of Technology, China |
| Pseudocode | Yes | Algorithm 1 Training with Stochastic Batch Augmentation and DSLR (a generic illustrative sketch follows the table) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code availability. |
| Open Datasets | Yes | The proposed SBA is empirically verified on several benchmark image classification datasets including CIFAR-10, CIFAR-100 and ImageNet. |
| Dataset Splits | Yes | For CIFAR-10 and CIFAR-100, we use the standard training/testing split (50,000 training images, 10,000 testing images) for fair comparison. For ImageNet, we use the standard 1.28 million training images and 50,000 validation images. |
| Hardware Specification | Yes | Experiments were conducted on NVIDIA Tesla V100 GPUs. |
| Software Dependencies | No | The paper states that the implementation is based on PyTorch, but it does not specify the version number of PyTorch or any other software dependencies with their versions. |
| Experiment Setup | Yes | For all experiments, we use SGD optimizer with Nesterov momentum of 0.9, a weight decay of 5e-4 and a batch size of 128. For CIFAR-10 and CIFAR-100, we train our models for 200 epochs with an initial learning rate of 0.1, which is divided by 10 at 100 and 150 epochs. For ImageNet, we train the model for 100 epochs, with the learning rate warmed up to 0.1 in the first 5 epochs and then decayed by 10 at 30, 60, and 90 epochs. (See the configuration sketch after the table.) |
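
The Pseudocode row points to Algorithm 1 (training with Stochastic Batch Augmentation and DSLR). For orientation only, the sketch below shows one way a training step could combine batch-level stochastic augmentation with a distilled soft-label regularization term in PyTorch. It is not a reconstruction of the paper's Algorithm 1: the Gaussian-noise augmentation, the `soft_label_fn` that supplies the soft-label distribution, the probability `p`, and the weight `lam` are all illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def train_epoch(model, loader, optimizer, soft_label_fn,
                p=0.5, lam=0.1, device="cpu"):
    """One epoch of training with stochastic batch augmentation and a
    soft-label regularizer (illustrative sketch, not the paper's Algorithm 1)."""
    model.train()
    for images, targets in loader:
        images, targets = images.to(device), targets.to(device)

        # Stochastic batch augmentation: with probability p, perturb the whole
        # batch (plain Gaussian noise here as a placeholder transformation).
        if torch.rand(1).item() < p:
            images = images + 0.1 * torch.randn_like(images)

        logits = model(images)

        # Standard supervised loss on the hard labels.
        ce_loss = F.cross_entropy(logits, targets)

        # Soft-label regularizer: KL divergence between the model's prediction
        # and a distilled soft-label distribution; soft_label_fn is a stand-in
        # for however those per-sample soft labels are produced.
        with torch.no_grad():
            soft_targets = soft_label_fn(images)  # (batch, num_classes) probabilities
        kl_loss = F.kl_div(F.log_softmax(logits, dim=1), soft_targets,
                           reduction="batchmean")

        loss = ce_loss + lam * kl_loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

In this sketch the soft labels enter the objective through a KL term weighted by `lam`, which is the usual way a distillation-style regularizer is attached to the cross-entropy loss.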
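
The training hyperparameters quoted in the Experiment Setup row can be collected into a configuration sketch. This is a minimal PyTorch/torchvision illustration under the stated settings (SGD with Nesterov momentum 0.9, weight decay 5e-4, batch size 128, learning rate 0.1 divided by 10 at epochs 100 and 150), not the authors' code; the `model` argument and data root are placeholders.

```python
import torch
import torchvision
import torchvision.transforms as T
from torch.utils.data import DataLoader


def build_cifar10_training(model, data_root="./data"):
    # Standard CIFAR-10 split (50,000 training / 10,000 test images),
    # selected via the `train` flag of the torchvision dataset.
    transform = T.Compose([T.ToTensor()])
    train_set = torchvision.datasets.CIFAR10(data_root, train=True,
                                             download=True, transform=transform)
    loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

    # SGD with Nesterov momentum 0.9, weight decay 5e-4, initial LR 0.1,
    # divided by 10 at epochs 100 and 150 over a 200-epoch run.
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9,
                                nesterov=True, weight_decay=5e-4)
    scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer,
                                                     milestones=[100, 150],
                                                     gamma=0.1)
    return loader, optimizer, scheduler
```

With this setup, `scheduler.step()` is called once per epoch over the 200-epoch run. The ImageNet schedule quoted in the same row (learning rate warmed up to 0.1 over the first 5 epochs, then decayed by 10 at epochs 30, 60, and 90) would additionally need a warmup wrapper, which `MultiStepLR` alone does not provide.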