Temporal Efficient Training of Spiking Neural Network via Gradient Re-weighting

Authors: Shikuang Deng, Yuhang Li, Shanghang Zhang, Shi Gu

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our method consistently outperforms the SOTA on all reported mainstream datasets, including CIFAR-10/100 and ImageNet. Remarkably on DVS-CIFAR10, we obtained 83% top-1 accuracy, over 10% improvement compared to existing state of the art. Codes are available at https://github.com/Gus-Lab/temporal_efficient_training. Our sufficient experiments on both static datasets and neuromorphic datasets prove the effectiveness of the TET method.
Researcher Affiliation | Academia | 1 University of Electronic Science and Technology of China; 2 Shenzhen Institute for Advanced Study, UESTC; 3 Yale University; 4 Peking University; 5 Peng Cheng Laboratory
Pseudocode | Yes | Algorithm 1: Temporal efficient training for one epoch (see the TET loss sketch after the table)
Open Source Code | Yes | Codes are available at https://github.com/Gus-Lab/temporal_efficient_training.
Open Datasets | Yes | We validate our proposed TET algorithm and compare it with existing works on both static and neuromorphic datasets. The CIFAR dataset (Krizhevsky et al., 2009) consists of 50k training images and 10k testing images with the size of 32x32. ImageNet (Deng et al., 2009) contains more than 1250k training images and 50k validation images. DVS-CIFAR10 (Li et al., 2017), the most challenging mainstream neuromorphic dataset, is converted from CIFAR10.
Dataset Splits | Yes | The CIFAR dataset (Krizhevsky et al., 2009) consists of 50k training images and 10k testing images with the size of 32x32. ImageNet (Deng et al., 2009) contains more than 1250k training images and 50k validation images. DVS-CIFAR10 (Li et al., 2017) ... Then, we split the dataset into 9k training images and 1k test images and reduce the spatial resolution to 48x48. (see the split sketch after the table)
Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU model, CPU type) used for running its experiments.
Software Dependencies | No | The paper mentions the use of an 'ANN programming platform' but does not specify any software dependencies with version numbers.
Experiment Setup | Yes | We use an Adam optimizer with a learning rate of 0.01 and cosine decay to 0. Next, following the TET algorithm, we increase the simulation time (to 4 and 6) and continue training the SNN for only 50 epochs, with the learning rate changed to 1e-4. We use an SGD optimizer with 0.9 momentum and weight decay 4e-5. The learning rate is set to 0.1 and cosine decays to 0. We train the SEW-ResNet34 (Fang et al., 2021) with T = 4 for 120 epochs. As for the Spiking-ResNet34 (Zheng et al., 2021), we use the TET algorithm to train for 90 epochs with T = 4 first, then change the simulation time to 6 and finetune the network for 30 epochs. We adopt an Adam optimizer in the finetune phase and change the learning rate to 1e-4. (see the optimizer sketch after the table)
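
The Algorithm 1 pseudocode noted in the 'Pseudocode' row optimizes the TET objective, in which the cross-entropy loss is applied to the network output at every timestep and averaged, optionally combined with an MSE term that pulls each timestep's output toward a constant. Below is a minimal PyTorch-style sketch of such a loss, assuming per-timestep logits of shape [T, batch, classes]; the function name `tet_loss` and the defaults `lambd=0.05` and `phi=1.0` are illustrative assumptions, not taken verbatim from the authors' code.

```python
import torch
import torch.nn.functional as F

def tet_loss(outputs, labels, lambd=0.05, phi=1.0):
    """Hedged sketch of a TET-style objective (not the authors' exact code).

    outputs: tensor of shape [T, batch, num_classes], per-timestep logits.
    labels:  tensor of shape [batch] with class indices.
    The cross-entropy term is averaged over all T timesteps instead of being
    applied only to the time-averaged output; an optional MSE term regularizes
    each timestep's logits toward the constant phi, weighted by lambd
    (both defaults are assumptions).
    """
    T = outputs.shape[0]
    ce = torch.stack([F.cross_entropy(outputs[t], labels) for t in range(T)]).mean()
    mse = F.mse_loss(outputs, torch.full_like(outputs, phi))
    return (1.0 - lambd) * ce + lambd * mse
```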
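
For the DVS-CIFAR10 split quoted in the 'Dataset Splits' row (9k training / 1k test images, downsampled to 48x48), here is a small sketch of how the split and resizing could be reproduced. The random index assignment, the frame tensor layout [T, C, H, W], and the helper name `downsample_frames` are assumptions, since the quoted text does not specify them.

```python
import torch
import torch.nn.functional as F

# 9k training / 1k test split of the 10,000 DVS-CIFAR10 samples, as quoted
# in the 'Dataset Splits' row (random assignment is an assumption; the paper
# does not state how the split is drawn).
num_samples = 10_000
perm = torch.randperm(num_samples)
train_idx, test_idx = perm[:9_000], perm[9_000:]

def downsample_frames(frames, size=48):
    """Reduce the spatial resolution of integrated event frames to 48x48.

    frames: tensor of shape [T, C, H, W] (e.g., C = 2 event polarities);
    the native DVS-CIFAR10 resolution is 128x128.
    """
    return F.interpolate(frames, size=(size, size), mode="bilinear", align_corners=False)

# Example with one dummy sample of T = 10 timesteps at 128x128 resolution.
example = torch.rand(10, 2, 128, 128)
print(downsample_frames(example).shape)  # torch.Size([10, 2, 48, 48])
```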
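
The 'Experiment Setup' row quotes concrete optimizer settings. Below is a hedged PyTorch reconstruction of those settings; the placeholder model, the 300-epoch count for the first phase, and all variable names are assumptions, while the optimizer types, learning rates, momentum, weight decay, and cosine decay to 0 come from the quoted text.

```python
import torch
from torch.optim import Adam, SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

model = torch.nn.Linear(10, 10)  # placeholder for the actual SNN

# Main-phase setting quoted above: Adam, lr = 0.01, cosine-annealed to 0
# over the training run (the 300-epoch horizon is an assumed value).
epochs = 300
opt_adam = Adam(model.parameters(), lr=0.01)
sched_adam = CosineAnnealingLR(opt_adam, T_max=epochs, eta_min=0.0)

# ImageNet SEW-ResNet34 setting quoted above: SGD with momentum 0.9,
# weight decay 4e-5, lr = 0.1 cosine-decayed to 0 over 120 epochs.
opt_sgd = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=4e-5)
sched_sgd = CosineAnnealingLR(opt_sgd, T_max=120, eta_min=0.0)

# Finetune phase quoted above: Adam with lr = 1e-4 after increasing the
# simulation time (the scheduler for this phase is not specified).
opt_finetune = Adam(model.parameters(), lr=1e-4)

# Schedulers would be stepped once per epoch, after optimizer.step(),
# inside the training loop (loop omitted here).
```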