Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
SWAT: Spatial Structure Within and Among Tokens
Authors: Kumara Kahatapitiya, Michael S. Ryoo
IJCAI 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our family of models, SWAT on image classification and semantic segmentation. We use Imagenet-1K [Deng et al., 2009] and ADE20K [Zhou et al., 2019] as benchmarks to compare against common Transformer/Mixer/Conv architectures such as DeiT [Touvron et al., 2021b], Swin [Liu et al., 2021], MLP-Mixer [Tolstikhin et al., 2021], ResMLP [Touvron et al., 2021a] and VAN [Guo et al., 2022]. |
| Researcher Affiliation | Academia | Kumara Kahatapitiya and Michael S. Ryoo Stony Brook University EMAIL |
| Pseudocode | No | The paper describes its methods using diagrams and text but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at github.com/kkahatapitiya/SWAT. |
| Open Datasets | Yes | We use Imagenet-1K [Deng et al., 2009] and ADE20K [Zhou et al., 2019] as benchmarks to compare against common Transformer/Mixer/Conv architectures... |
| Dataset Splits | Yes | ImageNet-1K [Deng et al., 2009] is a commonly-used classification benchmark, with 1.2M training images and 50K validation images, annotated with 1000 categories. ADE20K [Zhou et al., 2019] benchmark contains annotations for semantic segmentation across 150 categories. It comes with 25K annotated images in total, with 20K training, 2K validation and 3K testing. |
| Hardware Specification | Yes | FPS is measured on a single V100 GPU. |
| Software Dependencies | No | The paper mentions using the 'timm' library, 'mmsegmentation' framework, and 'PyTorch-like' implementations, but does not provide specific version numbers for any of these software components. |
| Experiment Setup | Yes | For all our models, we report Top-1 (%) accuracy on single-crop evaluation with complexity metrics such as Parameters and FLOPs. We train all our models for 300 epochs on inputs of 224x224 using the timm [Wightman, 2019] library. We use the original hyperparameters for all backbones, without further tuning. All models are trained with Mixed Precision. |