Harnessing Hard Mixed Samples with Decoupled Regularizer
Authors: Zicheng Liu, Siyuan Li, Ge Wang, Lirong Wu, Cheng Tan, Stan Z. Li
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on supervised and semi-supervised learning benchmarks across seven datasets validate the effectiveness of DM as a plug-and-play module. |
| Researcher Affiliation | Academia | Zicheng Liu1,2, Siyuan Li1,2, Ge Wang1,2, Cheng Tan1,2, Lirong Wu1,2, Stan Z. Li2; AI Lab, Research Center for Industries of the Future, Hangzhou, China; 1Zhejiang University; 2Westlake University; {liuzicheng; lisiyuan; wangge; tanchen; lirongwu; stan.zq.li}@westlake.edu.cn |
| Pseudocode | No | The paper describes its methods using mathematical equations and textual explanations, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks (a hedged sketch of the mixup baseline that DM builds on appears after the table). |
| Open Source Code | Yes | Source code and models are available at https://github.com/Westlake-AI/openmixup. |
| Open Datasets | Yes | This subsection evaluates performance gains of DM on six image classification benchmarks, including CIFAR-100 [25], Tiny-ImageNet (Tiny) [10], ImageNet-1k [48], CUB-200-2011 (CUB) [59], and FGVC-Aircraft (Aircraft) [42]. For semi-supervised transfer learning (TL) benchmarks [71], we perform TL experiments on CUB, Aircraft, and Stanford-Cars (Cars) [24]. |
| Dataset Splits | Yes | Tiny-ImageNet [10] is a rescaled version of ImageNet-1k, which has 100,000 training images and 10,000 validation images of 200 classes at 64×64 resolution. ImageNet-1k [26] contains 1,281,167 training images and 50,000 validation images of 1000 classes at 224×224 resolution. |
| Hardware Specification | Yes | Total training hours and GPU memory are collected on a single A100 GPU. |
| Software Dependencies | No | The paper mentions software such as PyTorch, the SGD, AdamW, and LAMB optimizers, OpenMixup [32], and TorchSSL [76], but it does not specify version numbers for these software components or for other dependencies such as Python or CUDA. |
| Experiment Setup | Yes | The SGD optimizer and a cosine learning rate scheduler [39] are used with an SGD weight decay of 0.0001, a momentum of 0.9, and a batch size of 100; all methods train for 800 epochs with a basic learning rate of lr = 0.1 on CIFAR-100 and for 400 epochs with lr = 0.2 on Tiny-ImageNet (see the configuration sketch after the table). |
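
Because the paper presents DM only through equations, the following is a minimal PyTorch sketch of the vanilla mixup cross-entropy baseline that the decoupled regularizer modifies. It is not the DM loss itself, and the helper names (`mixup_data`, `mixup_cross_entropy`) are illustrative rather than taken from OpenMixup.

```python
import torch
import torch.nn.functional as F


def mixup_data(x, y, alpha=1.0):
    """Standard mixup: convexly combine a batch with a shuffled copy of itself."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item() if alpha > 0 else 1.0
    index = torch.randperm(x.size(0), device=x.device)
    x_mixed = lam * x + (1.0 - lam) * x[index]
    return x_mixed, y, y[index], lam


def mixup_cross_entropy(logits, y_a, y_b, lam):
    """Vanilla mixup cross-entropy: lambda-weighted sum of the two per-class losses."""
    return lam * F.cross_entropy(logits, y_a) + (1.0 - lam) * F.cross_entropy(logits, y_b)
```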
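
For context, the hyperparameters quoted in the 'Experiment Setup' row translate into a short PyTorch training configuration. The sketch below assumes a ResNet-18 backbone and a standard CIFAR-100 loader as placeholders; only the optimizer, scheduler, batch size, epoch count, and learning rate come from the paper.

```python
import torch
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

# CIFAR-100 hyperparameters quoted in the paper: 800 epochs, batch size 100, lr 0.1.
EPOCHS, BATCH_SIZE, BASE_LR = 800, 100, 0.1

train_set = datasets.CIFAR100(root="./data", train=True, download=True,
                              transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True, num_workers=4)

model = models.resnet18(num_classes=100).cuda()  # placeholder backbone, not the paper's choice
optimizer = torch.optim.SGD(model.parameters(), lr=BASE_LR,
                            momentum=0.9, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    for images, targets in train_loader:
        images, targets = images.cuda(), targets.cuda()
        logits = model(images)
        # Plain cross-entropy here; the mixup/DM loss would be swapped in at this point.
        loss = torch.nn.functional.cross_entropy(logits, targets)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()
```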