AutoSampling: Search for Effective Data Sampling Schedules
Authors: Ming Sun, Haoxuan Dou, Baopu Li, Junjie Yan, Wanli Ouyang, Lei Cui
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our AutoSampling method to a variety of image classification tasks, illustrating the effectiveness of the proposed method. Comprehensive experiments on CIFAR-10/100 and ImageNet datasets (Krizhevsky, 2009; Deng et al., 2009) with different networks show that AutoSampling can increase the top-1 accuracy by up to 2.85% on CIFAR-10, 2.19% on CIFAR-100, and 2.83% on ImageNet. |
| Researcher Affiliation | Collaboration | 1 SenseTime Research, 2 Baidu USA LLC, 3 The University of Sydney. |
| Pseudocode | Yes | Algorithm 1: The Multi-Exploitation Step; Algorithm 2: Search-based AutoSampling |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | Comprehensive experiments on CIFAR-10/100 and ImageNet datasets (Krizhevsky, 2009; Deng et al., 2009) |
| Dataset Splits | No | The paper mentions evaluating on a 'held-out validation dataset' but does not provide specific details on its size, split percentage, or how it's created from the main datasets. |
| Hardware Specification | Yes | each worker is for one NVIDIA V100 GPU card ... For each worker we utilize eight NVIDIA V100 GPU cards |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In particular, for model training we use the base learning rate of 0.1 and a step decay learning rate schedule where the learning rate is divided by 10 after each 60 epochs. We run the experiments for 240 epochs. In addition, we set the training batch size to be 128 per worker. For ImageNet, which consists of 1.28 million training images, we adopted the base learning rate of 0.2 and a cosine decay learning rate schedule. We run the experiments with 100 epochs of training. For each worker we utilize eight NVIDIA V100 GPU cards and a total batch size of 512. (A minimal sketch of these schedules follows the table.) |
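
Since the paper does not release code, the reported hyperparameters can only be mapped onto standard training components by assumption. The sketch below is a minimal, hedged reconstruction of the two learning-rate schedules described in the Experiment Setup row, assuming a PyTorch SGD optimizer; the momentum and weight-decay values and the placeholder model are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of the reported training schedules, assuming PyTorch.
# Only the base learning rates, decay schedules, epoch counts, and batch sizes
# follow the paper's reported setup; momentum, weight decay, and the model
# below are placeholder assumptions.
import torch
from torch import nn, optim
from torch.optim.lr_scheduler import StepLR, CosineAnnealingLR


def make_cifar_optimizer(model: nn.Module):
    """CIFAR-10/100: base LR 0.1, divided by 10 every 60 epochs, 240 epochs,
    batch size 128 per worker (one V100 per worker)."""
    optimizer = optim.SGD(model.parameters(), lr=0.1,
                          momentum=0.9, weight_decay=5e-4)  # momentum/WD assumed
    scheduler = StepLR(optimizer, step_size=60, gamma=0.1)
    return optimizer, scheduler, 240  # total epochs


def make_imagenet_optimizer(model: nn.Module):
    """ImageNet: base LR 0.2, cosine decay over 100 epochs,
    total batch size 512 (eight V100s per worker)."""
    optimizer = optim.SGD(model.parameters(), lr=0.2,
                          momentum=0.9, weight_decay=1e-4)  # momentum/WD assumed
    scheduler = CosineAnnealingLR(optimizer, T_max=100)
    return optimizer, scheduler, 100  # total epochs


if __name__ == "__main__":
    # Placeholder model; the paper evaluates several standard classification networks.
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    optimizer, scheduler, epochs = make_cifar_optimizer(model)
    for epoch in range(epochs):
        # ... one training epoch over the (searched) data sampling schedule goes here ...
        scheduler.step()
```

This sketch covers only the per-worker training loop; the AutoSampling search itself (the multi-exploitation and exploration steps over sampling schedules) is described in the paper's Algorithms 1 and 2 and is not reproduced here.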