Intriguing Properties of Vision Transformers
Authors: Muhammad Muzammal Naseer, Kanchana Ranasinghe, Salman H Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We systematically study this question via an extensive set of experiments encompassing three ViT families and provide comparisons with a high-performing convolutional neural network (CNN). |
| Researcher Affiliation | Collaboration | Australian National University, Mohamed bin Zayed University of AI, Stony Brook University, Monash University, Linköping University, University of California, Merced, Yonsei University, Google Research |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | Code: https://git.io/Js15X. |
| Open Datasets | Yes | We consider visual recognition task with models pretrained on ImageNet [2]. The effect of occlusion is studied on the validation set (50k images). |
| Dataset Splits | Yes | We consider visual recognition task with models pretrained on ImageNet [2]. The effect of occlusion is studied on the validation set (50k images). |
| Hardware Specification | Yes | All the models are trained on 4 V100 GPUs. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA x.x). |
| Experiment Setup | Yes | Thus, we train models on SIN without applying any augmentation, label smoothing or mixup. |