Vision Transformers Are Robust Learners

Authors: Sayak Paul, Pin-Yu Chen

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we study the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples. We use six different diverse ImageNet datasets concerning robust classification to conduct a comprehensive performance comparison of ViT models and SOTA convolutional neural networks (CNNs), Big-Transfer. Through a series of six systematically designed experiments, we then present analyses that provide both quantitative and qualitative indications to explain why ViTs are indeed more robust learners. (See the evaluation sketch after the table.)
Researcher Affiliation | Industry | Sayak Paul (Carted) and Pin-Yu Chen (IBM Research); sayak@carted.com, pin-yu.chen@ibm.com
Pseudocode | No | The paper describes procedures and methods in paragraph form but does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code for reproducing our experiments is available at https://git.io/J3VO0.
Open Datasets | Yes | We use six different diverse ImageNet datasets concerning robust classification to conduct a comprehensive performance comparison of ViT models and SOTA convolutional neural networks (CNNs), Big-Transfer. We consistently observe a better performance across all the variants of ViT under different parameter regimes. We used 6 diverse ImageNet datasets concerning different types of robustness evaluation.
Dataset Splits | No | The paper refers to the "ImageNet-1k validation set" for sampling images in certain experiments but does not explicitly provide the percentages, sample counts, or methodology for creating the train/validation/test splits needed for reproduction.
Hardware Specification | No | The paper mentions "Google Cloud Platform credits" in the acknowledgements, but it does not specify the GPU models, CPU models, or other hardware used to run the experiments.
Software Dependencies | No | The paper does not provide specific software names with version numbers for reproducibility (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | No | The paper discusses training strategies such as using Adam or SGD and mentions dropout, but it does not provide concrete hyperparameter values (e.g., learning rates, batch sizes, number of epochs) for training the main models. It only gives specific hyperparameters for adversarial attack generation (e.g., epsilon = 0.002 for PGD and a step size of 50 for DeepFool), which belong to specific analyses rather than the general experimental setup. (See the PGD sketch after the table.)
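
To make the evaluation protocol in the Research Type row concrete, here is a minimal sketch of comparing a pretrained ViT against a BiT-style CNN on one slice of a corruption benchmark. It assumes PyTorch and timm are available and that corrupted images sit in an ImageNet-C-style folder tree (corruption/severity/class/image.jpg); the model names, path, and batch size are illustrative assumptions, not the paper's exact configuration.

```python
# Hedged sketch: top-1 accuracy of a ViT vs. a BiT-style CNN on corrupted images.
# Assumptions (not from the paper): PyTorch + timm, an ImageNet-C-style folder
# tree at "imagenet-c/gaussian_noise/3", and illustrative timm model names.
import timm
import torch
from torch.utils.data import DataLoader
from torchvision import datasets

device = "cuda" if torch.cuda.is_available() else "cpu"

def top1_accuracy(model_name: str, data_root: str) -> float:
    model = timm.create_model(model_name, pretrained=True).to(device).eval()
    # Use the preprocessing pipeline the checkpoint was trained with.
    cfg = timm.data.resolve_data_config({}, model=model)
    transform = timm.data.create_transform(**cfg)
    # ImageFolder's alphabetical wnid ordering matches timm's ImageNet indices.
    loader = DataLoader(datasets.ImageFolder(data_root, transform),
                        batch_size=64, num_workers=4)
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1).cpu()
            correct += (preds == labels).sum().item()
            total += labels.numel()
    return correct / total

# One corruption/severity slice; sweeping corruptions and severities gives
# mCE-style aggregate numbers.
for name in ["vit_base_patch16_224", "resnetv2_101x1_bitm"]:
    print(name, top1_accuracy(name, "imagenet-c/gaussian_noise/3"))
```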
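The Experiment Setup row quotes epsilon = 0.002 for PGD. A minimal L-infinity PGD implementation at that budget might look as follows; the step size, iteration count, and the assumption that inputs lie in [0, 1] are illustrative choices, not details reported in the paper.

```python
# Hedged sketch of an L-infinity PGD attack at eps = 0.002 (the value quoted
# in the report). Step size, iteration count, and the [0, 1] input range are
# assumptions for illustration only.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=0.002, step=0.0005, iters=10):
    """Return adversarial examples within an L-inf ball of radius eps around x."""
    x_adv = x.clone().detach()
    for _ in range(iters):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step * grad.sign()        # ascend the loss
            x_adv = x + (x_adv - x).clamp(-eps, eps)  # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)             # stay in valid pixel range
    return x_adv.detach()
```

Robust accuracy under this attack is then measured by scoring the model on pgd_attack(model, x, y) instead of the clean inputs x.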