Finding Visual Saliency in Continuous Spike Stream

Authors: Lin Zhu, Xianzhang Chen, Xiao Wang, Hua Huang

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the superior performance of our Recurrent Spiking Transformer framework in comparison to other spike neural network-based methods. Our framework exhibits a substantial margin of improvement in capturing and highlighting visual saliency in the spike stream, which not only provides a new perspective for spike-based saliency segmentation but also shows a new paradigm for full SNN-based transformer models.
Researcher Affiliation | Academia | (1) School of Computer Science and Technology, Beijing Institute of Technology, China; (2) School of Artificial Intelligence, Beijing Normal University, China; (3) School of Computer Science and Technology, Anhui University, China
Pseudocode | No | The paper describes the model's components and equations, but does not contain a structured pseudocode or algorithm block.
Open Source Code | Yes | The code and dataset are available at https://github.com/BIT-Vision/SVS.
Open Datasets | Yes | To facilitate the training and validation of our proposed model, we build a comprehensive real-world spike-based visual saliency dataset, enriched with numerous light conditions. Our dataset will be available to the research community for further investigation and exploration. The code and dataset are available at https://github.com/BIT-Vision/SVS.
Dataset Splits | Yes | To facilitate training and evaluation, we partition the dataset into a training set and a validation set. The training set comprises 100 sequences with 20,000 annotated frames, while the validation set consists of 30 sequences with 6,000 annotated frames. (A hedged split sketch follows the table.)
Hardware Specification | No | The paper discusses power consumption metrics but does not provide specific details about the hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | For a fair comparison, we use the same setting for all methods. AdamW is used to train all models for 20 epochs, and the initial learning rate is set to 2×10⁻⁵, which linearly decays with the epoch until 2×10⁻⁶. We use 256×256 as the input size, and the time interval of spike data is set to 0.02 s, which means the methods receive 400 spike frames at each iteration. We train the model under two settings: single-step and multi-step. (A hedged training-setup sketch follows the table.)
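
The dataset split reported above (100 training sequences with 20,000 annotated frames, 30 validation sequences with 6,000 frames) can be illustrated with a minimal sequence-level partition. The directory layout, function name, and use of a random shuffle below are assumptions for illustration; the paper may use a fixed split, and only the 100/30 counts come from the text.

```python
import random
from pathlib import Path

def split_sequences(root, n_train=100, n_val=30, seed=0):
    """Partition per-sequence directories into training and validation lists."""
    sequences = sorted(p for p in Path(root).iterdir() if p.is_dir())
    if len(sequences) < n_train + n_val:
        raise ValueError("expected at least n_train + n_val sequence directories")
    rng = random.Random(seed)
    rng.shuffle(sequences)  # assumed random split; the released dataset may fix this
    return sequences[:n_train], sequences[n_train:n_train + n_val]

# Example usage (the path is a placeholder for wherever the SVS data is unpacked):
# train_seqs, val_seqs = split_sequences("SVS/sequences")
```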
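
The optimizer settings in the Experiment Setup row can be expressed as a short configuration sketch: AdamW for 20 epochs with a learning rate decaying linearly from 2×10⁻⁵ to 2×10⁻⁶, and spike inputs of 400 frames at 256×256 resolution per iteration. The placeholder model, batch size, and loss below are assumptions; the authors' actual implementation is at the GitHub link above.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import LambdaLR

EPOCHS, LR_START, LR_END = 20, 2e-5, 2e-6

# Placeholder model: a 1x1 conv treating the 400 spike frames as channels.
# The paper's actual model is the Recurrent Spiking Transformer.
model = torch.nn.Conv2d(in_channels=400, out_channels=1, kernel_size=1)
optimizer = AdamW(model.parameters(), lr=LR_START)

def lr_lambda(epoch):
    # Linear decay of the learning rate from LR_START to LR_END across epochs.
    frac = epoch / max(EPOCHS - 1, 1)
    return (LR_START + (LR_END - LR_START) * frac) / LR_START

scheduler = LambdaLR(optimizer, lr_lambda)

for epoch in range(EPOCHS):
    # One illustrative iteration: 400 binary spike frames covering a 0.02 s
    # window, at 256x256 resolution, batch size 1 (assumed).
    spikes = torch.randint(0, 2, (1, 400, 256, 256)).float()
    saliency = model(spikes)   # (1, 1, 256, 256) placeholder prediction
    loss = saliency.mean()     # placeholder loss; real training uses saliency labels
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```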