Sign Gradient Descent-based Neuronal Dynamics: ANN-to-SNN Conversion Beyond ReLU Network
Authors: Hyunseok Oh, Youngki Lee
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on large-scale datasets show that our technique achieves (i) state-of-the-art performance in ANN-to-SNN conversion and (ii) is the first to convert new DNN architectures, e.g., ConvNeXt, MLP-Mixer, and ResMLP. Section 6 (Evaluations): We demonstrate the practical effectiveness of our sign GD-based neuronal dynamics in four-fold. First, we validate our support for diverse nonlinearities by converting new DNN architectures. Second, we compare the accuracy of converted ANNs with existing conversion techniques. Third, we verify our design choices through ablation studies. Finally, we visualize the effect of our technique on SNN inference speed. |
| Researcher Affiliation | Academia | 1Department of Computer Science & Engineering, Seoul National University, Seoul, Republic of Korea. |
| Pseudocode | Yes | Algorithm 4 ANN-to-SNN Conversion with sign GD-based Neuron (Definition 5.2) |
| Open Source Code | Yes | We publicly share our source code at www.github.com/snuhcs/snn_signgd. |
| Open Datasets | Yes | Experimental results on large-scale ImageNet (Deng et al., 2009) and CIFAR (Krizhevsky et al., 2009) datasets |
| Dataset Splits | No | No explicit training/validation/test dataset splits (e.g., percentages or counts) are provided in the paper. It refers to the 'training dataset' and to 'random 100 batches' for normalization, but does not specify the splits. |
| Hardware Specification | Yes | We run our experiments on a machine with AMD EPYC 7313 CPU, 512GB RAM, and NVIDIA RTX A6000. |
| Software Dependencies | No | The paper mentions 'spikingjelly (Fang et al., 2023) implementation', 'torchvision (maintainers & contributors, 2016)', and 'timm (Wightman et al., 2019)' as software used. However, specific version numbers for these software components are not provided. |
| Experiment Setup | Yes | To train DNN models for CIFAR datasets, we use SGD with learning rate 0.1, momentum 0.9, weight decay 5e-4, and cosine annealing schedule (Loshchilov & Hutter, 2016) of T_max = 300. (A sketch of this setup follows the table.) |
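
The Experiment Setup row quotes only the optimizer and schedule used to train the CIFAR ANNs before conversion. The following is a minimal PyTorch sketch of that recipe; the model architecture (a torchvision `resnet18`), batch size, epoch count, and data augmentation are assumptions filled in for illustration, not details taken from the paper.

```python
# Hedged sketch of the quoted CIFAR ANN training setup:
# SGD (lr=0.1, momentum=0.9, weight_decay=5e-4) with cosine annealing (T_max=300).
# Model choice, batch size, epoch count, and augmentation below are assumptions.
import torch
import torch.nn as nn
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),      # assumed augmentation
    transforms.RandomHorizontalFlip(),         # assumed augmentation
    transforms.ToTensor(),
])
train_set = datasets.CIFAR10("./data", train=True, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=4)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = models.resnet18(num_classes=10).to(device)    # placeholder ANN architecture

criterion = nn.CrossEntropyLoss()
optimizer = SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
scheduler = CosineAnnealingLR(optimizer, T_max=300)   # T_max = 300 as quoted

for epoch in range(300):                              # epoch count assumed to match T_max
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()                                  # one scheduler step per epoch
```

The released repository linked in the Open Source Code row would be the authoritative reference for the actual architectures, augmentation, and epoch counts; the sketch only mirrors the hyperparameters the paper states explicitly.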