Differentiable hierarchical and surrogate gradient search for spiking neural networks

Authors: Kaiwei Che, Luziwei Leng, Kaixuan Zhang, Jianguo Zhang, Qinghu Meng, Jie Cheng, Qinghai Guo, Jianxing Liao

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that our methods outperform SNNs based on sophisticated ANN architectures on image classification of CIFAR10, CIFAR100 and ImageNet datasets. Our models achieve state-of-the-art performances on classification of CIFAR10/100 and ImageNet with accuracy of 95.50%, 76.25% and 68.64%.
Researcher Affiliation | Collaboration | Kaiwei Che (1,2), Luziwei Leng (1,2), Kaixuan Zhang (1,2), Jianguo Zhang (1), Max Q.-H. Meng (1), Jie Cheng (2), Qinghai Guo (2), Jiangxing Liao (2); 1: Southern University of Science and Technology, China; 2: ACS Lab, Huawei Technologies, Shenzhen, China
Pseudocode | Yes | Algorithm 1: Differentiable surrogate gradient search (DGS). (An illustrative surrogate-gradient sketch follows the table.)
Open Source Code | Yes | Codes are available at https://github.com/Huawei-BIC/SpikeDHS.
Open Datasets | Yes | The CIFAR10 and CIFAR100 datasets [28] have 50K/10K training/testing RGB images with a spatial resolution of 32×32. The ImageNet dataset [12] contains more than 1250k training images and 50k test images. We further apply our method to event-based deep stereo matching on the widely used benchmark MVSEC dataset [80].
Dataset Splits | Yes | In the search phase, the training set is equally split into two subsets for bi-level optimization. For retraining, the standard training/testing split is used. (See the data-split sketch after the table.)
Hardware Specification | Yes | The architecture search takes about 1.4 GPU days on a single NVIDIA Tesla V100 (32G) GPU.
Software Dependencies | No | The paper mentions using a 'PyTorch package' for operation counting but does not specify version numbers for PyTorch or any other key software dependencies required to reproduce the experiments.
Experiment Setup | Yes | The search phase takes 50 epochs with mini-batch size 50; the first 15 epochs are used to warm up convolution weights. We use an SGD optimizer with momentum 0.9 and a learning rate of 0.025. After search, we retrain the model on target datasets with channel expansion for 100 epochs with mini-batch size 50 for CIFAR and 160 for ImageNet, with a cosine learning rate of 0.025. We use an SGD optimizer with weight decay 3e-4 and momentum 0.9. (See the retraining-setup sketch after the table.)
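
For readers unfamiliar with the surrogate-gradient mechanism that Algorithm 1 (DGS) searches over, the following is a minimal PyTorch sketch of a spiking activation with a Heaviside forward pass and a sigmoid-derivative surrogate backward pass. The surrogate shape and the sharpness parameter `alpha` are illustrative assumptions; the paper's DGS procedure optimizes over its own family of surrogate functions, which is not reproduced here.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Heaviside spike in the forward pass, smooth surrogate in the backward pass."""

    @staticmethod
    def forward(ctx, v, alpha):
        ctx.save_for_backward(v)
        ctx.alpha = alpha
        return (v >= 0.0).float()          # binary spike: 1 if membrane potential crosses threshold

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        # Sigmoid-derivative surrogate (an assumption); alpha controls its sharpness.
        sig = torch.sigmoid(ctx.alpha * v)
        surrogate = ctx.alpha * sig * (1.0 - sig)
        return grad_output * surrogate, None  # no gradient for alpha in this sketch

def spike(v, alpha=4.0):
    """Apply the spike nonlinearity to the pre-threshold membrane potential v."""
    return SpikeFn.apply(v, alpha)
```

In a DGS-style search, several candidate surrogate shapes or sharpness values would be mixed differentiably and their weights updated by gradient descent; the sketch above shows only a single fixed candidate.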
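
The equal split of the training set for bi-level optimization in the search phase can be reproduced with standard torchvision/PyTorch utilities. The use of `random_split` and the fixed seed below are assumptions, since the report does not state how the split is drawn.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# CIFAR10 training set (50K images), split equally into a weight-update subset
# and an architecture-update subset for the bi-level search.
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
half = len(train_set) // 2
weight_set, arch_set = random_split(
    train_set, [half, len(train_set) - half],
    generator=torch.Generator().manual_seed(0))  # fixed seed is an assumption

weight_loader = DataLoader(weight_set, batch_size=50, shuffle=True)  # mini-batch size 50, as quoted
arch_loader = DataLoader(arch_set, batch_size=50, shuffle=True)
```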
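
The quoted retraining hyperparameters map directly onto standard PyTorch optimizer and scheduler calls. Below is a minimal sketch; the placeholder linear model and the single synthetic mini-batch per epoch are assumptions standing in for the searched SNN and the full CIFAR data loader.

```python
import torch
import torch.nn.functional as F

model = torch.nn.Linear(3 * 32 * 32, 10)  # placeholder for the searched SNN (assumption)

# SGD with momentum 0.9 and weight decay 3e-4, cosine-decayed from lr 0.025 over 100 epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=0.025,
                            momentum=0.9, weight_decay=3e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=100)

for epoch in range(100):
    # One synthetic CIFAR-sized mini-batch (size 50) per epoch; a real run iterates the full loader.
    x = torch.randn(50, 3 * 32 * 32)
    target = torch.randint(0, 10, (50,))
    loss = F.cross_entropy(model(x), target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()
```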