LipsFormer: Introducing Lipschitz Continuity to Vision Transformers

Authors: Xianbiao Qi, Jianan Wang, Yihao Chen, Yukai Shi, Lei Zhang

ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that LipsFormer allows stable training of deep Transformer architectures without the need for careful learning rate tuning such as warmup, yielding faster convergence and better generalization. As a result, on the ImageNet-1K dataset, LipsFormer-Swin-Tiny based on Swin Transformer, trained for 300 epochs, can obtain 82.7% top-1 accuracy without any learning rate warmup (a warmup-free schedule sketch follows the table).
Researcher Affiliation | Academia | International Digital Economy Academy (IDEA), Shenzhen, Guangdong, China. {qixianbiao,wangjianan,chenyihao,shiyukai,leizhang}@idea.edu.cn
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a link to a code repository.
Open Datasets | Yes | We evaluate LipsFormer-CSwin on the standard ImageNet-1K [15] dataset, which consists of 1.28M images and 1,000 classes.
Dataset Splits | Yes | We evaluate LipsFormer-CSwin on the standard ImageNet-1K [15] dataset, which consists of 1.28M images and 1,000 classes. Following DeiT III [47], we also evaluate our method on the ImageNet-v2 [43] and ImageNet-real [4] datasets.
Hardware Specification | Yes | All the models are implemented with PyTorch and trained on NVIDIA Tesla A100 GPUs.
Software Dependencies | No | All the models are implemented with PyTorch and trained on NVIDIA Tesla A100 GPUs. However, no specific version number for PyTorch or other software dependencies is provided.
Experiment Setup | Yes | In Table 6 we provide the ImageNet-1K training details used for producing the main results in Table 2. All LipsFormer variants use the same training hyperparameters, except for DropPath ratio, weight decay, learning rate and EMA. All the models are implemented with PyTorch and trained on NVIDIA Tesla A100 GPUs.
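
The Research Type row quotes the paper's central empirical claim: deep Transformer training remains stable with no learning rate warmup. As a rough, hedged illustration of what that looks like on the training-script side, the sketch below builds a PyTorch optimizer with a cosine schedule that starts at the peak learning rate from step 0; the model, learning rate, weight decay, and epoch count are placeholders, not the paper's Table 6 settings.

    # Hedged sketch of a warmup-free schedule in PyTorch. The model, peak learning
    # rate, weight decay, and epoch count are placeholders, not the paper's values;
    # only the absence of a warmup ramp reflects the quoted claim.
    import torch
    from torch.optim import AdamW
    from torch.optim.lr_scheduler import CosineAnnealingLR

    model = torch.nn.Linear(768, 1000)   # stand-in module, not a LipsFormer variant
    optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)

    epochs = 300
    # Cosine decay begins at the peak learning rate immediately -- no warmup phase.
    scheduler = CosineAnnealingLR(optimizer, T_max=epochs)

    for epoch in range(epochs):
        # ... one training epoch over the dataset would run here ...
        optimizer.step()   # placeholder for the per-batch update loop
        scheduler.step()   # learning rate follows cosine decay only

A conventional Transformer recipe would wrap the first few epochs in a linear warmup before the cosine decay; the paper's claim is that LipsFormer's Lipschitz-continuous components make that ramp unnecessary.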