Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers
Authors: Johann Schmidt, Sebastian Stober
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluated our method on several benchmark datasets, including a synthesised ImageNet test set. ITS outperforms the utilised baselines on all zero-shot test scenarios. |
| Researcher Affiliation | Academia | Artificial Intelligence Lab, Otto-von-Guericke University, Magdeburg, Germany. Correspondence to: Johann Schmidt <johann.schmidt@ovgu.de>. |
| Pseudocode | Yes | Supplementary to the descriptions and illustrations in Figure 4 and Figure 5, we provide the pseudocode of our proposed algorithm in Algorithm 1. |
| Open Source Code | Yes | More details can be found in our publicly available source code: www.github.com/johSchm/ITS |
| Open Datasets | Yes | We evaluated our method on several benchmark datasets, including a synthesised ImageNet test set. [...] We trained a CNN, a GCNN (Cohen & Welling, 2016) and a RotDCF (Cheng et al., 2018) on the vanilla (canonical) MNIST. [...] SI-Score (short SI) (Djolonga et al., 2021) is a synthetic vision dataset for robustness testing, comprising semantically masked ImageNet (Russakovsky et al., 2015) objects |
| Dataset Splits | Yes | We split the vanilla datasets into disjunct training, validation, and test sets. We always employ the vanilla training set to fit the model and validate it on the vanilla validation set. |
| Hardware Specification | Yes | All experiments are performed on an Nvidia A40 GPU (48GB) node with 1 TB RAM, 2x 24-core AMD EPYC 74F3 CPUs @ 3.20GHz, and a local SSD (NVMe). |
| Software Dependencies | No | The software specifications of our implementations can be found in our open-sourced code. |
| Experiment Setup | Yes | If not further specified, we used zero-padding to define areas outside the pixel space Ω, bilinear interpolation, and a group cardinality of n = 17. [...] These models are trained with the AdamW optimizer (Loshchilov & Hutter, 2017) using default parameters. We minimised the negative log-likelihood using ground-truth image labels. We used a learning rate of 5e-3, 3 epochs for MNIST, 5 epochs for Fashion-MNIST, 10 epochs for GTSRB and mini-batches of size 128. |
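
The split protocol quoted in the "Dataset Splits" row could be realised as follows. This is a minimal sketch assuming PyTorch/torchvision; the 90/10 train/validation ratio and the seed are illustrative assumptions, not values reported by the authors.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Vanilla (canonical) MNIST: the model is fitted on the vanilla training set
# and validated on the vanilla validation set, as quoted above.
full_train = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
test_set = datasets.MNIST("data", train=False, download=True,
                          transform=transforms.ToTensor())

# Carve a disjoint validation set out of the official training split.
n_val = len(full_train) // 10  # assumed 90/10 ratio, for illustration only
train_set, val_set = random_split(
    full_train,
    [len(full_train) - n_val, n_val],
    generator=torch.Generator().manual_seed(0),  # reproducible split
)
```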
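Similarly, the hyperparameters quoted in the "Experiment Setup" row map onto a standard PyTorch training loop. The sketch below is an illustration under stated assumptions, not the authors' implementation: `model` is a hypothetical stand-in classifier (the ITS architecture is not reproduced here), and `rotated_copies` only demonstrates the quoted interpolation and padding settings for the n = 17 rotation group. Only the optimizer, loss, learning rate, epoch counts, and batch size come from the quote.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
from torchvision.transforms import InterpolationMode
from torchvision.transforms.functional import rotate

N_ROT = 17  # group cardinality n = 17 from the quote

def rotated_copies(x):
    # Illustration of the quoted spatial settings: bilinear interpolation and
    # zero-padding (fill=0) for pixels falling outside the pixel space.
    angles = [i * 360.0 / N_ROT for i in range(N_ROT)]
    return [rotate(x, a, interpolation=InterpolationMode.BILINEAR, fill=0.0)
            for a in angles]

# Hypothetical stand-in classifier; log-softmax output pairs with the NLL loss.
model = nn.Sequential(
    nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(),
    nn.Linear(128, 10), nn.LogSoftmax(dim=1),
)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-3)  # otherwise defaults

train_set = datasets.MNIST("data", train=True, download=True,
                           transform=transforms.ToTensor())
loader = DataLoader(train_set, batch_size=128, shuffle=True)

for epoch in range(3):  # 3 epochs for MNIST (5 for Fashion-MNIST, 10 for GTSRB)
    for images, labels in loader:
        optimizer.zero_grad()
        loss = F.nll_loss(model(images), labels)  # negative log-likelihood
        loss.backward()
        optimizer.step()
```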