DynaMS: Dynamic Margin Selection for Efficient Deep Learning

Authors: Jiaxing Wang, Yong Li, Jingwei Zhuo, Xupeng Shi, Weizhong Zhang, Lixing Gong, Tong Tao, Pengzhang Liu, Yongjun Bao, Weipeng Yan

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive analysis and experiments demonstrate the superiority of the proposed approach in data selection against many state-of-the-art counterparts on benchmark datasets."
Researcher Affiliation | Collaboration | Jiaxing Wang¹, Yong Li¹, Jingwei Zhuo¹, Xupeng Shi², Weizhong Zhang³, Lixing Gong¹, Tong Tao¹, Pengzhang Liu¹, Yongjun Bao¹, Weipeng Yan¹ (¹JD.com, ²Northeastern University, ³Fudan University)
Pseudocode | Yes | Algorithm 1: Margin selection MS(w, T, γ); Algorithm 2: Dynamic margin selection (DynaMS); Algorithm 3: Dynamic margin selection (DynaMS) with parameter sharing proxy (PSP)
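To make the margin-selection subroutine concrete, below is a minimal PyTorch sketch of what Algorithm 1 MS(w, T, γ) might look like. It assumes the margin is the true-class logit minus the largest competing logit and that MS keeps the γ fraction of examples closest to the decision boundary; the function name and this exact margin definition are illustrative assumptions, not quoted from the paper.

```python
import torch

@torch.no_grad()
def margin_select(model, loader, gamma, device="cuda"):
    """Sketch of MS(w, T, gamma): rank examples by classification margin
    and keep the gamma fraction with the smallest margins. Assumes the
    loader iterates T in a fixed (non-shuffled) order so the returned
    indices are meaningful."""
    model.eval()
    margins = []
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        logits = model(x)
        true_logit = logits.gather(1, y.view(-1, 1)).squeeze(1)
        # Mask out the true class, then take the best competing logit.
        logits.scatter_(1, y.view(-1, 1), float("-inf"))
        runner_up = logits.max(dim=1).values
        margins.append(true_logit - runner_up)
    margins = torch.cat(margins)
    k = int(gamma * margins.numel())
    # Smallest margins = closest to the decision boundary.
    return torch.topk(-margins, k).indices
```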
Open Source Code | Yes | Code is available at https://github.com/ylfzr/DynaMS-subset-selection.
Open Datasets | Yes | "We conduct experiments on CIFAR-10 (Krizhevsky & Hinton, 2009) and ImageNet (Jia et al., 2009)"
Dataset Splits | Yes | "We conduct experiments on CIFAR-10 (Krizhevsky & Hinton, 2009) and ImageNet (Jia et al., 2009), following standard data pre-processing in He et al. (2016). For CIFAR-10, we train ResNet-18 (He et al., 2016) for 200 epochs. ... For ImageNet, we choose ResNet-18 and ResNet-50 as base models. Following the conventions, the total number of training epochs is 120."
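For reference, a sketch of the "standard data pre-processing in He et al. (2016)" for CIFAR-10, assuming the usual 4-pixel-padded random crop, horizontal flip, and per-channel normalization; the normalization statistics below are the commonly used CIFAR-10 values, not quoted from the paper.

```python
import torchvision.transforms as T
from torchvision.datasets import CIFAR10

# He et al. (2016)-style CIFAR-10 augmentation for training; the test
# split gets only tensor conversion and normalization.
train_tf = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
test_tf = T.Compose([
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])

train_set = CIFAR10(root="./data", train=True, download=True, transform=train_tf)
test_set = CIFAR10(root="./data", train=False, download=True, transform=test_tf)
```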
Hardware Specification | Yes | "We conduct experiments on a NVIDIA Ampere A-100. ... We conduct experiments on four NVIDIA Ampere A-100s."
Software Dependencies | No | No explicit mention of software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow) was found.
Experiment Setup | Yes | Section 4.1 EXPERIMENTAL SETUP and Table 8 (Hyper-parameters of DynaMS for different models on CIFAR-10 and ImageNet):

Hyper-parameter            CIFAR-10 ResNet-18   ImageNet ResNet-18   ImageNet ResNet-50
Batch Size                 128                  512                  512
Init. Learning Rate of W   0.1                  0.1                  0.1
Learning Rate Decay        Stepwise 0.2         Stepwise 0.1         Stepwise 0.1
LR Decay Milestones        {60, 120, 160}       {40, 80}             {40, 80}
Optimizer                  SGD                  SGD                  SGD
Momentum                   0.9                  0.9                  0.9
Nesterov                   True                 True                 True
Weight Decay               5e-4                 1e-4                 1e-4
Max Epochs                 200                  120                  120
Selection Interval         10                   10                   10
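The CIFAR-10 column of the table maps directly onto a standard PyTorch optimizer and learning-rate schedule. A minimal sketch, with the network and training loop elided; the placeholder model and loop structure are assumptions, not the authors' code.

```python
import torch

# CIFAR-10 / ResNet-18 settings from Table 8: SGD with Nesterov momentum
# 0.9, weight decay 5e-4, initial LR 0.1, stepwise decay by 0.2 at
# epochs 60, 120, and 160, for 200 epochs in total.
model = torch.nn.Linear(10, 10)  # placeholder; the paper uses ResNet-18
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, nesterov=True, weight_decay=5e-4
)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.2  # gamma = decay factor
)

for epoch in range(200):
    # ... one training epoch over the current subset; per the table's
    # selection interval, DynaMS would re-run margin selection every
    # 10 epochs ...
    scheduler.step()
```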