Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Spiking Vision Transformer with Saccadic Attention

Authors: Shuai Wang, Malu Zhang, Dehao Zhang, Ammar Belatreche, Yichen Xiao, Yu Liang, Yimeng Shan, Qian Sun, Enqi Zhang, Yang Yang

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments across various visual tasks demonstrate that SNN-ViT achieves state-of-the-art performance with linear computational complexity.
Researcher Affiliation | Academia | ¹University of Electronic Science and Technology of China, ²Northumbria University, ³Liaoning Technical University
Pseudocode | No | The paper describes methods and models using mathematical equations and textual explanations, but it does not include explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a link to a code repository for the described methodology.
Open Datasets | Yes | SNN-ViT is evaluated on both static and neuromorphic datasets, including CIFAR10, CIFAR100 (Krizhevsky et al., 2009), ImageNet (Deng et al., 2009), and CIFAR10-DVS (Li et al., 2017). Specifically, for ImageNet, the input image size is 3 × 224 × 224, the batch size is 128, and training is conducted over 310 epochs. Experimental results are summarized in Tables 1 and 2. Two remote sensing datasets are also used: NWPU VHR-10 (Cheng et al., 2017) and SSDD (Wang et al., 2019).
Dataset Splits | No | The paper mentions datasets such as CIFAR10, CIFAR100, ImageNet, CIFAR10-DVS, NWPU VHR-10, and SSDD, but it does not explicitly provide training/validation/test splits (e.g., percentages, sample counts, or clear references to predefined splits for each dataset) in the main text.
Hardware Specification | Yes | Experiments are carried out on a high-performance computing platform equipped with an NVIDIA RTX 4090 GPU, using Stochastic Gradient Descent (SGD) as the optimization algorithm.
Software Dependencies | No | The paper mentions 'Stochastic Gradient Descent (SGD) as the optimization algorithm' and the 'YOLO-v3 architecture' but does not name specific software dependencies (libraries, frameworks, or programming languages) with version numbers.
Experiment Setup | Yes | Specifically, for ImageNet, the input image size is 3 × 224 × 224, the batch size is 128, and training is conducted over 310 epochs. The initial learning rate is set at 1 × 10⁻², adjusted according to a polynomial decay strategy. The entire training process spans 300 epochs on the NWPU-VHR-10 and SSDD datasets, ensuring comprehensive learning and adaptation to data characteristics. In this configuration, the expansion ratio of the Multi-Layer Perceptron (MLP) is fixed at 4 to achieve an optimal balance between computational efficiency and model performance.
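The Experiment Setup row specifies an initial learning rate of 1 × 10⁻² adjusted by a polynomial decay strategy over 310 epochs (ImageNet setting). A minimal sketch of that schedule is below; the decay exponent (`power`) is an assumption, since the extract does not state it, and the function names are illustrative rather than taken from the paper's code.

```python
# Hedged sketch of the learning-rate schedule described in the paper's
# experiment setup: initial LR 1e-2 with polynomial decay over 310 epochs.
# The decay exponent `power` is an ASSUMPTION; the paper does not state it.

INITIAL_LR = 1e-2     # initial learning rate from the paper
TOTAL_EPOCHS = 310    # ImageNet training length from the paper

def poly_lr(epoch: int, power: float = 2.0) -> float:
    """Learning rate at a given epoch under polynomial decay."""
    return INITIAL_LR * (1.0 - epoch / TOTAL_EPOCHS) ** power

print(poly_lr(0))  # → 0.01 at the start of training
```

The rate decays monotonically to zero at epoch 310; in a framework such as PyTorch, the same shape could be obtained with a built-in polynomial scheduler rather than a hand-rolled function.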