Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Class-Discriminative Attention Maps for Vision Transformers
Authors: Lennart Brocki, Jakub Binda, Neo Christopher Chung
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our quantitative benchmarks include correctness, compactness, and class sensitivity, in comparison to 7 other importance estimators. Vanilla, Smooth, and Integrated CDAM excel across all three benchmarks. In particular, our results suggest that existing importance estimators may not provide sufficient class-sensitivity. We demonstrate the utility of CDAM in medical images by training and explaining malignancy and biomarker prediction models based on lung Computed Tomography (CT) scans. |
| Researcher Affiliation | Academia | Lennart Brocki EMAIL Jakub Binda EMAIL Neo Christopher Chung EMAIL Institute of Informatics, University of Warsaw |
| Pseudocode | No | The paper describes methods using mathematical equations and text, but no explicitly labeled pseudocode block or algorithm section is present. |
| Open Source Code | Yes | Code available: https://github.com/lenbrocki/CDAM |
| Open Datasets | Yes | We conduct several quantitative evaluations focusing on correctness, class sensitivity, and compactness. By using the ImageNet samples (Deng et al., 2009) with multiple objects (Beyer et al., 2020) and applying importance estimators for different classes, we quantify the level of class-discrimination. ... Lastly, we have applied CDAM on a ViT fine-tuned on the Lung Image Database Consortium image collection (LIDC) (Armato III et al., 2011). |
| Dataset Splits | Yes | Training, validation, and test sets (in the ratios of 0.7225, 0.1275, 0.15) were stratified by and balanced according to these labels, e.g., benign and malignant. ... The LIDC dataset was split into 5 folds and stratified according to malignancy status. |
| Hardware Specification | No | This research was carried out with the support of the Interdisciplinary Centre for Mathematical and Computational Modelling University of Warsaw (ICM UW) under computational allocation no GDM-3540; the IDUB program (Excellence Initiative Research University); the NVIDIA Corporation's Academic Hardware Grant; and the Google Cloud Research Innovators program. While NVIDIA hardware and Google Cloud are mentioned, specific GPU/CPU models or detailed specifications are not provided in the paper. |
| Software Dependencies | No | We use random resized cropping and horizontal flipping with PyTorch default arguments as augmentation, Adam optimizer with learning rate 3 × 10⁻⁴, batch size of 128, and train for 10 epochs. The paper mentions PyTorch and the Adam optimizer but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We use random resized cropping and horizontal flipping with PyTorch default arguments as augmentation, Adam optimizer with learning rate 3 × 10⁻⁴, batch size of 128, and train for 10 epochs. The parameters of the ViT backbone are frozen during training, so only the classifier head is trainable. ... In a parameter sweep, we varied the number of trainable layers (10–50) and dropout rates (0.0–0.09), where the learning rate was exponentially decaying (α = 0.0003 and β = 0.95). The best accuracy on the test set of 0.85 was obtained with 50 trainable layers and a dropout rate of 0.0031. |
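The reported label-stratified 0.7225 / 0.1275 / 0.15 train/validation/test split can be illustrated with a minimal sketch. This is not the authors' code: the `stratified_split` function and the example sample counts are hypothetical, assuming only that the split is performed per label group in the stated ratios.

```python
import random
from collections import defaultdict

def stratified_split(samples, labels, ratios=(0.7225, 0.1275, 0.15), seed=0):
    """Split samples into train/val/test sets, stratified by label.

    Each label group is shuffled and partitioned separately, so every
    split preserves the overall label balance (e.g. benign vs. malignant).
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_label[label].append(sample)

    train, val, test = [], [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_train = round(ratios[0] * len(group))
        n_val = round(ratios[1] * len(group))
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]
    return train, val, test

# Hypothetical example: 400 benign (label 0) and 400 malignant (label 1) scans.
labels = [0] * 400 + [1] * 400
samples = list(range(800))
train, val, test = stratified_split(samples, labels)
print(len(train), len(val), len(test))  # → 578 102 120
```

Note that 0.7225 + 0.1275 = 0.85, so the split is equivalent to first holding out 15% for testing and then splitting the remainder 85/15 into training and validation.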