Adaptive Smoothing Gradient Learning for Spiking Neural Networks
Authors: Ziming Wang, Runhao Jiang, Shuang Lian, Rui Yan, Huajin Tang
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments on static images, dynamic event streams, speech, and instrumental sounds show that the proposed method achieves state-of-the-art performance across all the datasets, with remarkable robustness to different relaxation degrees. To validate the effectiveness of the proposed method, we perform experiments on various types of data, including static images and spatio-temporal patterns such as dynamic event streams, spoken digital speech, and instrumental music. Additionally, we investigate the evolution of network dynamics and the impact of noise probability to understand how injecting noise with spikes provides a real observation of the loss landscape of target SNNs. |
| Researcher Affiliation | Academia | (1) College of Computer Science, Zhejiang University; (2) College of Computer Science, Zhejiang University of Technology; (3) Research Center for Intelligent Computing Hardware, Zhejiang Lab. |
| Pseudocode | Yes | Algorithm 1 Core function in ASGL |
| Open Source Code | Yes | Code is available at https://github.com/Windere/ASGL-SNN |
| Open Datasets | Yes | To validate the effectiveness of the proposed method, we perform experiments on various types of data, including static images and spatio-temporal patterns such as dynamic event streams, spoken digital speech, and instrumental music. The CIFAR-10, CIFAR-100, Tiny-ImageNet, DVS-CIFAR10, Spiking Heidelberg Digits (SHD), MedleyDB, and DVS128 Gesture datasets are used. |
| Dataset Splits | Yes | For training and evaluation, the SHD dataset is split into a training set (8156 samples) and a test set (2264 samples); for the DVS128 Gesture dataset, all samples are split into a training set (1208 samples) and a test set (134 samples). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software frameworks such as PyTorch, MXNet, and TensorFlow, and optimizers such as ADAM and SGD, but it does not specify version numbers for these software components. |
| Experiment Setup | Yes | In particular, we use the SGD optimizer with a weight decay of 5e-4 over 100 epochs (CIFAR-10) and 300 epochs (CIFAR-100). We utilize the ADAM optimizer with an initial learning rate λ = 0.001 for the CIFAR-100 dataset when adjusting p, while for the CIFAR-10 dataset we employ the SGD optimizer with an initial learning rate of λ = 0.1. Additionally, a cyclic cosine annealing learning-rate scheduler is adopted. For the SHD dataset, we discretize the time into 250 time steps... The corresponding network architecture is 700-240-20... For the MedleyDB dataset, we increase the noise probability at the 30th, 70th, 90th, and 95th epochs with a discretization of 500 time steps. All the p and ζ values we use for each dataset are shown in Table 7 unless otherwise specified. |
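
The quoted Experiment Setup row can be read as a concrete training configuration. Below is a minimal PyTorch sketch of the stated optimizers and cosine-annealing schedule; the placeholder model, the momentum value, and the exact scheduler class are assumptions for illustration and are not taken from the authors' released ASGL code.

```python
import torch
import torch.nn as nn


def cifar_training_setup(model: nn.Module, dataset: str = "CIFAR-10"):
    """Sketch of the optimizer/scheduler hyperparameters quoted above."""
    if dataset == "CIFAR-10":
        # Quoted: SGD, initial learning rate 0.1, weight decay 5e-4, 100 epochs.
        epochs = 100
        optimizer = torch.optim.SGD(
            model.parameters(),
            lr=0.1,
            momentum=0.9,        # assumption: momentum is not stated in the excerpt
            weight_decay=5e-4,
        )
    else:
        # Quoted for CIFAR-100: ADAM with initial learning rate 1e-3 (used when
        # adjusting the noise probability p), 300 epochs, weight decay 5e-4.
        epochs = 300
        optimizer = torch.optim.Adam(
            model.parameters(), lr=1e-3, weight_decay=5e-4
        )
    # Quoted: a cyclic cosine annealing learning-rate scheduler is adopted.
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=epochs)
    return optimizer, scheduler, epochs


if __name__ == "__main__":
    # Placeholder network; the actual SNN architectures are defined in the paper/repo.
    net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    optimizer, scheduler, epochs = cifar_training_setup(net, "CIFAR-10")
    for _ in range(epochs):
        # ... forward/backward passes and optimizer.step() over the training loader ...
        scheduler.step()
```

Note that the quoted "cyclic" cosine annealing may instead correspond to a restarting schedule (e.g., `CosineAnnealingWarmRestarts` in PyTorch); the released repository should be consulted for the exact scheduler used.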