Deep Perturbation Learning: Enhancing the Network Performance via Image Perturbations

Authors: Zifan Song, Xiao Gong, Guosheng Hu, Cairong Zhao

ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Extensive experiments demonstrate the effectiveness of our DPL on 6 datasets (CIFAR-10, CIFAR-100, ImageNet, MS-COCO, PASCAL VOC, and SBD) over 3 popular vision tasks (image classification, object detection, and semantic segmentation) with different backbone architectures (e.g., ResNet, MobileNet, and ViT). |
| Researcher Affiliation | Collaboration | (1) Department of Computer Science and Technology, Tongji University, Shanghai, China; (2) State Key Laboratory of Integrated Services Networks, Xidian University, Shaanxi, China; (3) Department of Mathematics, Nanjing University, Nanjing, China; (4) Oosto, 38330 Belfast, U.K. |
| Pseudocode | Yes | Algorithm 1 Deep Perturbation Learning (DPL). Require: D_0 = {(x_i^0, y_i)}_{i=1}^N, θ_0, n, α, f_θ. Ensure: θ_T. 1: for t = 0, 1, ..., T−1 do ... (a hedged sketch of this loop is given below the table). |
| Open Source Code | No | The paper does not provide a repository link or an explicit statement about releasing the source code for the described methodology. |
| Open Datasets | Yes | We evaluate the generalization capacity of DPL on 6 datasets (CIFAR-10, CIFAR-100 (Krizhevsky et al., 2009), ImageNet (Deng et al., 2009), MS-COCO (Lin et al., 2014), PASCAL VOC (Everingham et al., 2010), and SBD (Hariharan et al., 2011)). |
| Dataset Splits | Yes | The ImageNet-1k (Deng et al., 2009) dataset contains 1,281,167 training samples and 50,000 validation samples of 1000 classes. |
| Hardware Specification | Yes | For the ImageNet and COCO datasets, experiments are conducted on 4 Tesla V100S GPUs; experiments on the other datasets are performed on 2 RTX 3090 GPUs. |
| Software Dependencies | No | We use PyTorch (Paszke et al., 2017) to implement and train all the corresponding models in this paper. No specific version numbers for PyTorch or other software dependencies are provided. |
| Experiment Setup | Yes | For image classification, the networks are trained with batch size 128 for 200 epochs on the CIFAR datasets, with random cropping and flipping. Training uses SGD (momentum = 0.9); the initial learning rate of 0.1 is decayed by a factor of 0.01 at epochs 160 and 180. For object detection, models are trained with batch size 32 and input size 300 × 300 on PASCAL VOC, using a learning rate of 1e-3 for 80k iterations, then 20k iterations each at 1e-4 and 1e-5. For semantic segmentation, the networks are trained with batch size 16; the initial learning rate, momentum, and weight decay are set to 0.009, 0.9, and 0.0001, respectively, and scaling (0.5 to 2.0), cropping, and flipping are applied to all training data. (See the training-config sketch below the table.) |
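The Pseudocode row above preserves only the header of Algorithm 1, so the sketch below reconstructs the alternating loop it implies. This is a minimal PyTorch sketch under stated assumptions: that DPL perturbs the training images by gradient *descent* on the task loss (n inner steps of size α, the opposite sign of an adversarial attack) and then updates the weights θ on the perturbed batch. The exact update rule is not recoverable from the extract, so every named choice here (signless gradient step, pixel clamping, cross-entropy loss) is an assumption, not the authors' method.

```python
import torch
import torch.nn.functional as F

def dpl_round(model, optimizer, images, labels, n=1, alpha=0.01):
    """One round of the alternating scheme suggested by Algorithm 1.

    Assumption: the perturbation step moves images in the direction that
    DECREASES the task loss; the weight step is ordinary SGD on the
    perturbed batch. Both are hypothetical readings of the extract.
    """
    # --- perturbation step: n gradient-descent updates on the inputs ---
    x = images.clone().detach().requires_grad_(True)
    for _ in range(n):
        loss = F.cross_entropy(model(x), labels)
        (grad,) = torch.autograd.grad(loss, x)
        # descend the loss w.r.t. the input; clamp assumes pixels in [0, 1]
        x = (x - alpha * grad).clamp(0, 1).detach().requires_grad_(True)

    # --- weight step: update theta on the perturbed images ---
    optimizer.zero_grad()
    F.cross_entropy(model(x.detach()), labels).backward()
    optimizer.step()

    # the perturbed batch plays the role of D_{t+1} in the next round
    return x.detach()
```

Returning the perturbed batch mirrors how Algorithm 1 carries an updated dataset D_t from one iteration t to the next.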
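To make the reported classification schedule concrete, here is a minimal PyTorch sketch matching the Experiment Setup row: SGD with momentum 0.9, initial learning rate 0.1 decayed by a factor of 0.01 at epochs 160 and 180, batch size 128 for 200 epochs, with random cropping and flipping on CIFAR. The ResNet-18 backbone and the padding=4 crop are common defaults assumed here, not values stated in the extract, and DPL's perturbation step is omitted.

```python
import torch
from torch import nn, optim
from torchvision import datasets, transforms, models

# Augmentations reported for CIFAR: random cropping and flipping
# (padding=4 is an assumed common default, not stated in the extract).
train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])
train_set = datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

model = models.resnet18(num_classes=10)  # stand-in backbone; the paper tests several
optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
# lr 0.1, decayed by a factor of 0.01 at epochs 160 and 180
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[160, 180], gamma=0.01)

criterion = nn.CrossEntropyLoss()
for epoch in range(200):  # 200 epochs, as reported
    for x, y in loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()
    scheduler.step()
```

No weight decay is listed for the classification setup in the extract, so none is set here; the segmentation setup, by contrast, reports weight decay 0.0001.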