Spatial-Temporal Self-Attention for Asynchronous Spiking Neural Networks
Authors: Yuchen Wang, Kexin Shi, Chengzhuo Lu, Yuguo Liu, Malu Zhang, Hong Qu
IJCAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on popular neuromorphic datasets and speech datasets, including DVS128 Gesture, CIFAR10-DVS, and Google Speech Commands, and our experimental results can outperform other state-of-the-art models. |
| Researcher Affiliation | Academia | Yuchen Wang, Kexin Shi, Chengzhuo Lu, Yuguo Liu, Malu Zhang and Hong Qu, School of Computer Science and Engineering, University of Electronic Science and Technology of China; yuchenwang@std.uestc.edu.cn, kexinshi@std.uestc.edu.cn, 2019270101012@std.uestc.edu.cn, liuyuguo@std.uestc.edu.cn, maluzhang@uestc.edu.cn, hongqu@uestc.edu.cn |
| Pseudocode | No | The paper describes the proposed methods using mathematical formulas and prose, but it does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Codes are available at https://github.com/ppppps/STSA_4_Asyn_SNN. |
| Open Datasets | Yes | In order to verify the effectiveness of the proposed method, we conduct object recognition experiments on neuromorphic vision datasets DVS128 Gesture [Amir et al., 2017] and CIFAR10-DVS [Li et al., 2017], and speech recognition experiments on Google Speech Commands V1 and Google Speech Commands V2 [Warden, 2018]. |
| Dataset Splits | No | For DVS128 Gesture: "the owner of the dataset divides 1,176 of them into the training set and 288 into the test set." For CIFAR10-DVS: "researchers divide the first 900 samples of each category into the training set and the remaining 100 samples into the test set. We also used this 9:1 division ratio in our experiments." For Google Speech Commands: "randomly select 1500 samples per command and split them into the training sets and test sets at a ratio of 8:2." No explicit validation split is mentioned for any of the datasets. (A minimal split sketch based on these descriptions follows the table.) |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud resources). |
| Software Dependencies | No | The paper mentions software components like the "AdamW" optimizer and "temporal efficient training," but it does not provide specific version numbers for any software libraries, frameworks, or programming languages. |
| Experiment Setup | Yes | The initial learning rate is set to 0.01 and a cosine learning rate decay schedule is used. The loss function of temporal efficient training [Deng et al., 2022] is adopted, and an L2 penalty with a value of 1e-4 is also added. In the tokenization process, four 3×3 convolutional layers are used in the convolutional stem; each convolutional layer is followed by a max-pooling layer with a stride of 2 to divide the original image into 16×16 patches. The LIF neurons of the constructed SNNs adopt a uniform setting: the firing threshold is set to 1 and the decay coefficient τ is set to 0.5. The batch size for both training and testing is 32, and the number of epochs is set to 1000. (A configuration sketch appears below the table.) |
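
The Dataset Splits row quotes per-class 9:1 (CIFAR10-DVS) and 8:2 (Google Speech Commands) divisions with no validation set. The sketch below only illustrates those quoted splits; the `samples_by_class` / `samples_by_command` mappings from label to an ordered list of sample paths are hypothetical, since the paper does not describe file handling.

```python
import random

def split_cifar10_dvs(samples_by_class):
    """9:1 per-class split quoted for CIFAR10-DVS: the first 900 samples of
    each category form the training set, the remaining 100 the test set."""
    train, test = [], []
    for label, samples in samples_by_class.items():
        train += [(path, label) for path in samples[:900]]
        test += [(path, label) for path in samples[900:]]
    return train, test

def split_speech_commands(samples_by_command, seed=0):
    """8:2 split quoted for Google Speech Commands: randomly select 1500
    samples per command, then divide them into training and test sets."""
    rng = random.Random(seed)
    train, test = [], []
    for command, samples in samples_by_command.items():
        chosen = rng.sample(samples, 1500)                      # 1500 per command
        train += [(path, command) for path in chosen[:1200]]   # 80% -> training
        test += [(path, command) for path in chosen[1200:]]    # 20% -> test
    return train, test
```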
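The Experiment Setup row lists the reported hyperparameters without code. The following is a minimal PyTorch sketch of that configuration, not the authors' implementation: the conv-stem channel widths and 2-channel event input are assumptions, the reported L2 penalty of 1e-4 is mapped onto AdamW's weight decay, and the TET loss is only referenced in a comment.

```python
import torch
import torch.nn as nn

# Convolutional stem for tokenization: four 3x3 convolutions, each followed by a
# stride-2 max-pooling layer. Four halvings downsample a 128x128 DVS frame by 16x,
# so each output token covers a 16x16 patch of the original image.
# Channel widths and the 2 input channels (event polarities) are assumptions.
def make_conv_stem(in_channels=2, channels=(32, 64, 128, 256)):
    layers, c_in = [], in_channels
    for c_out in channels:
        layers += [
            nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
        ]
        c_in = c_out
    return nn.Sequential(*layers)

# LIF neuron settings quoted from the paper.
FIRING_THRESHOLD = 1.0
DECAY_TAU = 0.5

# Reported training configuration; the conv stem stands in for the full STSA model,
# and the TET loss [Deng et al., 2022] is not reproduced here.
model = make_conv_stem()
optimizer = torch.optim.AdamW(model.parameters(), lr=0.01, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=1000)

BATCH_SIZE = 32   # used for both training and testing
EPOCHS = 1000
```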