Group Equivariant Stand-Alone Self-Attention For Vision

Authors: David W. Romero, Jean-Baptiste Cordonnier

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments on vision benchmarks demonstrate consistent improvements of GSA-Nets over non-equivariant self-attention networks.
Researcher Affiliation | Academia | David W. Romero, Vrije Universiteit Amsterdam, d.w.romeroguzman@vu.nl; Jean-Baptiste Cordonnier, École Polytechnique Fédérale de Lausanne (EPFL), jean-baptiste.cordonnier@epfl.ch
Pseudocode | No | The paper presents mathematical formulations and conceptual diagrams (Fig. B.1, Fig. B.2) but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is publicly available at https://github.com/dwromero/g_selfatt.
Open Datasets | Yes | Rot MNIST. The rotated MNIST dataset (Larochelle et al., 2007)... CIFAR-10. The CIFAR-10 dataset (Krizhevsky et al., 2009)... PCam. The Patch Camelyon dataset (Veeling et al., 2018)
Dataset Splits | Yes | Rot MNIST... divided into training, validation and test sets of 10k, 2k and 50k images... CIFAR-10... divided into training, validation and test sets of 40k, 10k and 10k images. (A split sketch follows the table.)
Hardware Specification | No | The paper mentions GPU usage in Table 1 (e.g., '1GPU', '2GPU'), but it does not specify concrete hardware details like GPU models (e.g., NVIDIA A100, RTX 2080 Ti) or CPU types.
Software Dependencies | No | The paper states 'We utilize PyTorch for our implementation' but does not provide specific version numbers for PyTorch or other software dependencies.
Experiment Setup | Yes | For rotational MNIST... We train for 300 epochs and utilize the Adam optimizer, batch size of 8, weight decay of 0.0001 and learning rate of 0.001. (A training-setup sketch follows the table.)
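
The CIFAR-10 figures quoted in the Dataset Splits row (40k/10k/10k) can be reproduced mechanically. The paper does not state how the validation set was carved out of the 50k official training images; the sketch below assumes torchvision's CIFAR10 loader and a seeded random_split, both of which are illustrative choices rather than the authors' procedure.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Sketch of the 40k/10k/10k CIFAR-10 split quoted in the Dataset Splits row.
# The exact split procedure is not given in the paper; a seeded random_split
# is one plausible, reproducible way to obtain it.
transform = transforms.ToTensor()

full_train = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=transform)

train_set, val_set = random_split(
    full_train, [40_000, 10_000], generator=torch.Generator().manual_seed(0)
)
print(len(train_set), len(val_set), len(test_set))  # 40000 10000 10000
```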
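
The Experiment Setup row lists concrete hyperparameters for rotated MNIST (300 epochs, Adam, batch size 8, weight decay 0.0001, learning rate 0.001). The minimal sketch below wires those values into a standard PyTorch training loop; the linear model is only a placeholder, since the actual GSA-Net architecture is defined in the authors' repository, and the loop itself is an assumption, not the authors' training script.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader

# Hyperparameters quoted in the paper for rotated MNIST.
EPOCHS = 300
BATCH_SIZE = 8
LR = 1e-3
WEIGHT_DECAY = 1e-4

# Placeholder model: the real GSA-Net architecture lives in the authors' repository.
model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=LR, weight_decay=WEIGHT_DECAY)
criterion = nn.CrossEntropyLoss()

def train(train_set):
    """Train with the quoted settings on a rotated-MNIST-style dataset."""
    loader = DataLoader(train_set, batch_size=BATCH_SIZE, shuffle=True)
    for epoch in range(EPOCHS):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```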