Diversified Batch Selection for Training Acceleration

Authors: Feng Hong, Yueming Lyu, Jiangchao Yao, Ya Zhang, Ivor Tsang, Yanfeng Wang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across various tasks demonstrate the significant superiority of DivBS in the performance-speedup trade-off.
Researcher Affiliation | Collaboration | 1. Cooperative Medianet Innovation Center, Shanghai Jiao Tong University, Shanghai, China; 2. CFAR, Agency for Science, Technology and Research (A*STAR), Singapore; 3. IHPC, Agency for Science, Technology and Research (A*STAR), Singapore; 4. Shanghai AI Laboratory, Shanghai, China; 5. College of Computing and Data Science, NTU, Singapore.
Pseudocode | Yes | Algorithm 1: the greedy algorithm (a hypothetical greedy-selection sketch appears after this table).
Open Source Code | Yes | The code is publicly available.
Open Datasets | Yes | Datasets. We conduct experiments to evaluate our DivBS on CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100 (Krizhevsky et al., 2009), and Tiny-ImageNet (Le & Yang, 2015) for image classification.
Dataset Splits | No | The paper uses standard datasets (CIFAR-10, CIFAR-100, and Tiny-ImageNet) but does not explicitly state the training/validation/test splits, nor does it cite predefined validation splits for these datasets.
Hardware Specification | No | The paper mentions general 'hardware techniques' in the discussion but does not specify the GPU models, CPU types, or other hardware used to run the experiments.
Software Dependencies | No | The paper mentions specific models (e.g., ResNet, DeepLabV3, MobileNet) and optimizers (SGD, AdamW), but does not list any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | Models are trained using SGD with momentum of 0.9 and weight decay of 0.005 as the optimizer. The initial learning rate is set to 0.1. We train the model for 200 epochs with cosine learning-rate scheduling.
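
To make the quoted setup concrete, below is a minimal PyTorch sketch of that optimizer and schedule (SGD, momentum 0.9, weight decay 0.005, initial learning rate 0.1, 200 epochs, cosine annealing). The model and training loop are placeholders and are not taken from the paper's released code.

```python
import torch
import torch.nn as nn

# Placeholder model; the paper trains ResNet-style backbones on CIFAR/Tiny-ImageNet.
model = nn.Linear(3 * 32 * 32, 100)

# Optimizer as quoted: SGD, momentum 0.9, weight decay 0.005, initial lr 0.1.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=0.005)

# Cosine learning-rate schedule over the 200-epoch budget.
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=200)

for epoch in range(200):
    # ... one epoch of training on the selected sub-batches goes here (placeholder) ...
    scheduler.step()
```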
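
For the Pseudocode row, the paper's Algorithm 1 is a greedy selection procedure. The snippet below is only a hypothetical illustration of generic greedy, diversity-oriented subset selection (repeatedly pick the sample whose vector has the largest component orthogonal to the span of those already chosen); it is not a reproduction of the paper's Algorithm 1, and the function name and interface are invented here.

```python
import torch


def greedy_diverse_select(vectors: torch.Tensor, k: int) -> list:
    """Greedily pick k row indices from an (n, d) matrix, favouring rows that
    are least redundant with the rows already selected (Gram-Schmidt style).
    Illustrative only; not the paper's Algorithm 1."""
    residual = vectors.clone().float()
    selected = []
    for _ in range(k):
        norms = residual.norm(dim=1)
        if selected:  # never re-pick an index
            norms[selected] = float("-inf")
        idx = int(norms.argmax())
        selected.append(idx)
        # Remove the chosen direction from every remaining row.
        direction = residual[idx] / (residual[idx].norm() + 1e-12)
        residual = residual - torch.outer(residual @ direction, direction)
    return selected


# Toy usage: select 2 of 4 vectors; the two nearly parallel rows are not both chosen.
scores = torch.tensor([[1.0, 0.0], [0.99, 0.01], [0.0, 1.0], [0.5, 0.5]])
print(greedy_diverse_select(scores, k=2))
```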