Targeted Activation Penalties Help CNNs Ignore Spurious Signals

Authors: Dekai Zhang, Matt Williams, Francesca Toni

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the power of TAP against two state-of-the-art baselines on the MNIST benchmark and on two clinical image datasets, using four different CNN architectures. ... Our findings are supported by experiments (i) on MNIST (LeCun, Cortes, and Burges 1998) using a simple two-layer CNN as a standard benchmark and, to show the efficacy of TAP on higher-stakes real-world datasets, (ii) on two clinical datasets for pneumonia (Kermany et al. 2018) and osteoarthritis (Chen et al. 2019) using three commonly used architectures: VGG-16, ResNet-18 (He et al. 2016) and DenseNet-121 (Huang et al. 2017).
Researcher Affiliation | Academia | Dekai Zhang (1), Matt Williams (2,3), Francesca Toni (1); (1) Department of Computing, Imperial College London; (2) Department of Radiotherapy, Charing Cross Hospital; (3) Institute of Global Health Innovation, Imperial College London
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Source code: https://github.com/dkaizhang/TAP
Open Datasets | Yes | Our findings are supported by experiments (i) on MNIST (LeCun, Cortes, and Burges 1998) using a simple two-layer CNN as a standard benchmark and, to show the efficacy of TAP on higher-stakes real-world datasets, (ii) on two clinical datasets for pneumonia (Kermany et al. 2018) and osteoarthritis (Chen et al. 2019) using three commonly used architectures: VGG-16, ResNet-18 (He et al. 2016) and DenseNet-121 (Huang et al. 2017).
Dataset Splits | Yes | We use the pre-defined training and test splits for all datasets and reserve 10% of the training split for validation.
Hardware Specification | Yes | Experiments were implemented with PyTorch 1.13 and run on a Linux Ubuntu 18.04 machine with an Nvidia RTX 3080 GPU with 10GB VRAM.
Software Dependencies | Yes | Experiments were implemented with PyTorch 1.13 and run on a Linux Ubuntu 18.04 machine with an Nvidia RTX 3080 GPU with 10GB VRAM.
Experiment Setup | Yes | We choose cross-entropy for the task loss L_Task. We use SGD as optimiser with weight decay of 0.9. We train for 50 epochs. We use random initialisation for the two-layer CNN and use a learning rate of 10^-3. For VGG-16, ResNet-18 and DenseNet-121 we initialise with ImageNet weights and use a learning rate of 10^-5. We use a batch size of 256 for MNIST and 16 for the medical datasets (8 for DenseNet-121 with RBR, given memory constraints).
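The split scheme quoted in the Dataset Splits row (pre-defined train/test splits, with 10% of the training split held out for validation) can be sketched in PyTorch as follows. This is an illustrative reconstruction, not the authors' code: the toy tensor dataset and the fixed seed are assumptions.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy stand-in for a pre-defined training split (e.g. MNIST's official
# training set); 1000 random 28x28 images, purely illustrative.
train_full = TensorDataset(torch.randn(1000, 1, 28, 28),
                           torch.randint(0, 10, (1000,)))

# Reserve 10% of the training split for validation, as the paper reports.
n_val = int(0.1 * len(train_full))
train_set, val_set = random_split(
    train_full, [len(train_full) - n_val, n_val],
    generator=torch.Generator().manual_seed(0))

print(len(train_set), len(val_set))  # 900 100
```

The test split is never touched here; only the training split is subdivided, matching the quoted protocol.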
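As a minimal sketch, the MNIST branch of the quoted Experiment Setup (two-layer CNN, random initialisation, cross-entropy task loss, SGD with weight decay 0.9, learning rate 10^-3, batch size 256) might look like the code below. The exact layer sizes of the paper's two-layer CNN are not given in this summary, so the architecture here is a hypothetical stand-in, and the TAP penalty term itself (defined in the paper) is omitted.

```python
import torch
import torch.nn as nn

# Hypothetical two-layer CNN; the paper's exact architecture is not
# specified in this summary.
class TwoLayerCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 7 * 7, 10)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

model = TwoLayerCNN()                 # random initialisation
criterion = nn.CrossEntropyLoss()     # cross-entropy task loss
# Weight decay of 0.9 as quoted in the setup above.
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, weight_decay=0.9)

# One illustrative training step on a random batch (batch size 256).
x, y = torch.randn(256, 1, 28, 28), torch.randint(0, 10, (256,))
loss = criterion(model(x), y)
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

For the pretrained branch, the quoted setup swaps in VGG-16, ResNet-18 or DenseNet-121 with ImageNet weights and lowers the learning rate to 10^-5, with the full TAP objective adding its activation penalty to the task loss.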