DynaMS: Dynamic Margin Selection for Efficient Deep Learning
Authors: Jiaxing Wang, Yong Li, Jingwei Zhuo, Xupeng Shi, Weizhong Zhang, Lixing Gong, Tong Tao, Pengzhang Liu, Yongjun Bao, Weipeng Yan
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive analysis and experiments demonstrate the superiority of the proposed approach in data selection against many state-of-the-art counterparts on benchmark datasets. |
| Researcher Affiliation | Collaboration | Jiaxing Wang1, Yong Li1, Jingwei Zhuo1, Xupeng Shi2, Weizhong Zhang3, Lixing Gong1, Tong Tao1, Pengzhang Liu1, Yongjun Bao1, Weipeng Yan1 1JD.com 2Northeastern University 3Fudan University |
| Pseudocode | Yes | Algorithm 1 Margin selection: MS(w, T, γ) ... Algorithm 2 Dynamic margin selection (DynaMS) ... Algorithm 3 Dynamic margin selection (DynaMS) with parameter sharing proxy (PSP) (see the margin-selection sketch below the table) |
| Open Source Code | Yes | Code is available at https://github.com/ylfzr/DynaMS-subset-selection. |
| Open Datasets | Yes | We conduct experiments on CIFAR-10 Krizhevsky & Hinton (2009) and ImageNet Jia et al. (2009) |
| Dataset Splits | Yes | We conduct experiments on CIFAR-10 Krizhevsky & Hinton (2009) and ImageNet Jia et al. (2009), following standard data pre-processing in He et al. (2016). For CIFAR-10, we train ResNet-18 (He et al., 2016) for 200 epochs. ... For ImageNet, we choose ResNet-18 and ResNet-50 as base models. Following the conventions, the total training epoch is 120. |
| Hardware Specification | Yes | We conduct experiments on a NVIDIA Ampere A-100. ... We conduct experiments on four NVIDIA Ampere A-100s. |
| Software Dependencies | No | No explicit mention of specific software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions) found. |
| Experiment Setup | Yes | 4.1 EXPERIMENTAL SETUP and Table 8: Hyper-parameters of DynaMS for different models on CIFAR-10 and ImageNet (columns: CIFAR-10 ResNet-18 / ImageNet ResNet-18 / ImageNet ResNet-50). Batch Size 128 / 512 / 512; Init. Learning Rate of W 0.1 / 0.1 / 0.1; Learning Rate Decay Stepwise 0.2 / Stepwise 0.1 / Stepwise 0.1; LR Decay Milestones {60,120,160} / {40,80} / {40,80}; Optimizer SGD / SGD / SGD; Momentum 0.9 / 0.9 / 0.9; Nesterov True / True / True; Weight Decay 5e-4 / 1e-4 / 1e-4; Max Epochs 200 / 120 / 120; Selection Interval 10 / 10 / 10 (see the configuration sketch below the table) |
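
The pseudocode row refers to Algorithm 1, MS(w, T, γ), which selects a subset of the training set based on classification margins. The report does not reproduce the algorithm body, so the following is a minimal sketch of a margin-based selection rule, assuming the margin of a sample is the gap between its true-class logit and the largest competing logit, and that the γ fraction of samples with the smallest margins is kept. The function name `margin_select` and the PyTorch framing are illustrative, not the paper's API.

```python
import torch


@torch.no_grad()
def margin_select(model, loader, gamma, device="cuda"):
    """Return indices of the gamma fraction of samples with the smallest margin.

    The margin here is the gap between the true-class logit and the largest
    competing logit; small (or negative) margins mean the sample lies close to,
    or on the wrong side of, the decision boundary.
    Assumes `loader` iterates over the dataset in a fixed order (shuffle=False).
    """
    model.eval()
    margins = []
    for x, y in loader:
        logits = model(x.to(device))
        y = y.to(device).view(-1, 1)
        true_score = logits.gather(1, y).squeeze(1)
        # Mask out the true class, then take the best competing score.
        competing = logits.scatter(1, y, float("-inf")).max(dim=1).values
        margins.append(true_score - competing)
    margins = torch.cat(margins)
    k = max(1, int(gamma * len(margins)))
    return margins.argsort()[:k]  # dataset positions of the selected subset
```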
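The experiment-setup row lists SGD with Nesterov momentum, step-wise learning-rate decay, and a selection interval of 10 epochs. Since the software-dependencies row records no framework, the mapping below onto PyTorch is an assumption; it sketches how the CIFAR-10 column of Table 8 would translate into an optimizer and scheduler, with the training loop reduced to a placeholder.

```python
import torch
from torch import optim
from torchvision.models import resnet18

# CIFAR-10 column of Table 8: SGD, lr 0.1, momentum 0.9, Nesterov,
# weight decay 5e-4, 200 epochs, step-wise decay x0.2 at epochs 60/120/160.
model = resnet18(num_classes=10)
optimizer = optim.SGD(
    model.parameters(),
    lr=0.1,
    momentum=0.9,
    nesterov=True,
    weight_decay=5e-4,
)
scheduler = optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[60, 120, 160], gamma=0.2
)

selection_interval = 10  # re-run subset selection every 10 epochs, per Table 8

for epoch in range(200):
    if epoch % selection_interval == 0:
        pass  # refresh the training subset here, e.g. with margin_select(...)
    # ... one training epoch over the current subset ...
    scheduler.step()
```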