Statistical Test for Attention Maps in Vision Transformers
Authors: Tomohiro Shiraishi, Daiki Miwa, Teruyuki Katsuoka, Vo Nguyen Le Duy, Kouichi Taji, Ichiro Takeuchi
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the validity and the effectiveness of the proposed method through numerical experiments and applications to brain image diagnoses. (See also Section 4, Numerical Experiments.) |
| Researcher Affiliation | Academia | ¹Nagoya University, Aichi, Japan; ²Nagoya Institute of Technology, Aichi, Japan; ³University of Information Technology, Ho Chi Minh City, Vietnam; ⁴Vietnam National University, Ho Chi Minh City, Vietnam; ⁵RIKEN, Tokyo, Japan. |
| Pseudocode | Yes | Algorithm 1 Selective p-value Computation by Adaptive Grid Search |
| Open Source Code | Yes | For reproducibility, our implementation is available at https://github.com/shirara1016/statistical_test_for_vit_attention. |
| Open Datasets | Yes | We examined the brain image dataset extracted from the dataset used in Buda et al. (2019), which included 939 and 941 images with and without tumors, respectively. |
| Dataset Splits | No | The paper states specific numbers for training and testing data for the brain image dataset ('700 images each with and without tumors for training' and 'The remaining images with and without tumors were used for testing'), but it does not specify a separate validation split or explicit percentages/counts for all splits needed for reproduction across all datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for conducting the experiments. |
| Software Dependencies | No | The paper mentions using 'TensorFlow' for auto differentiation but does not provide specific version numbers for TensorFlow or any other software dependencies. |
| Experiment Setup | Yes | In all experiments, we set the threshold value τ = 0.6, the grid search interval [-S, S] with S = 10 + \|z_obs\|, the minimum grid width ε_min = 10^-4, the maximum grid width ε_max = 0.2, and the significance level α = 0.05. |
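
The settings quoted in the Experiment Setup and Dataset Splits rows can be collected in a small configuration sketch. This is a minimal illustration, not the authors' implementation: the class name `ExperimentSetup`, its field names, and the helper `remaining_for_test` are hypothetical. Only the numeric values and the formula S = 10 + |z_obs| come from the rows above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ExperimentSetup:
    """Values quoted in the Experiment Setup row (field names are hypothetical)."""

    tau: float = 0.6       # threshold value τ
    eps_min: float = 1e-4  # minimum grid width ε_min
    eps_max: float = 0.2   # maximum grid width ε_max
    alpha: float = 0.05    # significance level α

    def search_interval(self, z_obs: float) -> Tuple[float, float]:
        """Return the grid search interval [-S, S] with S = 10 + |z_obs|."""
        s = 10.0 + abs(z_obs)
        return (-s, s)


def remaining_for_test(n_total: int, n_train: int = 700) -> int:
    """Images left for testing after taking n_train images for training."""
    return n_total - n_train


setup = ExperimentSetup()

# Search interval for an illustrative (made-up) observed test statistic.
interval = setup.search_interval(z_obs=-2.5)

# Brain image dataset: 939 images with tumors, 941 without; 700 of each for training.
n_test_with_tumor = remaining_for_test(939)
n_test_without_tumor = remaining_for_test(941)
```

With 700 images per class used for training, the remaining test sets hold 239 images with tumors and 241 without. The paper does not report a separate validation split, so none is modeled here.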