Reducing ANN-SNN Conversion Error through Residual Membrane Potential

Authors: Zecheng Hao, Tong Bu, Jianhao Ding, Tiejun Huang, Zhaofei Yu

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results show that the proposed method achieves state-of-the-art performance on CIFAR-10, CIFAR-100, and ImageNet datasets.
Researcher Affiliation | Academia | Zecheng Hao (1), Tong Bu (1), Jianhao Ding (1), Tiejun Huang (1), Zhaofei Yu (1,2)*; 1: School of Computer Science, Peking University; 2: Institute for Artificial Intelligence, Peking University. Emails: zechenghao@pku.edu.cn, putong30@pku.edu.cn, djh01998@stu.pku.edu.cn, {tjhuang,yuzf12}@pku.edu.cn
Pseudocode | Yes | Algorithm 1: The optimization strategy based on residual membrane potential (SRP). (A background sketch of the residual-potential mechanism this builds on appears after the table.)
Open Source Code | Yes | Code is available at https://github.com/hzc1208/ANN2SNN_SRP.
Open Datasets | Yes | In this section, we evaluate the performance of our methods for image classification tasks on CIFAR-10 (LeCun et al. 1998), CIFAR-100 and ImageNet (Deng et al. 2009) datasets under the network architecture of ResNet-18, ResNet-20, ResNet-34 (He et al. 2016) and VGG-16 (Simonyan and Zisserman 2014). (A hedged data-loading sketch appears after the table.)
Dataset Splits | No | The paper mentions training and testing on datasets like CIFAR-10, CIFAR-100, and ImageNet, which typically have standard splits. However, it does not explicitly state the training/validation/test split percentages, sample counts, or specific methodology for splitting the data.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU, memory, or cloud instance types) used for running experiments were mentioned.
Software Dependencies | No | The paper mentions using 'Stochastic Gradient Descent' and 'cosine decay scheduler' and cites related works, but does not specify any software names with version numbers (e.g., Python 3.x, PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | The training procedure of ANNs is consistent with (Bu et al. 2022b). We replace ReLU activation layers with QCFS and select average-pooling as the pooling layers. We set λ^l in each activation layer as a trainable threshold and v^l(0) = θ^l/2 in the IF neuron. For CIFAR-10, we set L = 4 on all network architectures. For CIFAR-100, we set L = 4 for VGG-16 and L = 8 for ResNet-20. For ImageNet, we set L = 16 for VGG-16 and L = 8 for ResNet-34. We use Stochastic Gradient Descent (Bottou 2012) as our training optimizer and a cosine decay scheduler (Loshchilov and Hutter 2017) to adjust the learning rate. We set the momentum parameter as 0.9. The initial learning rate is set as 0.1 (CIFAR-10/ImageNet) or 0.02 (CIFAR-100) and weight decay is set as 5e-4 (CIFAR-10/100) or 1e-4 (ImageNet). In addition, we also use common data normalization and data enhancement techniques (DeVries and Taylor 2017; Cubuk et al. 2019; Li et al. 2021a) for all datasets. During the procedure of adopting SRP, for CIFAR-10/100, we recommend τ = 4 for all network architectures. For ImageNet, we recommend τ = 14 for VGG-16 and τ = 8 for ResNet-34. (A hedged sketch of these training settings appears after the table.)
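
The SRP pseudocode itself (Algorithm 1) is not reproduced in this report. As background for the term "residual membrane potential", the block below is a minimal sketch, assuming PyTorch, of a soft-reset integrate-and-fire neuron of the kind used in ANN-SNN conversion; it is not the paper's Algorithm 1, and the class name, output scaling, and threshold handling are illustrative assumptions. It shows where the residual potential comes from: because the reset only subtracts the threshold instead of zeroing the membrane, any input charge not yet emitted as spikes remains in `v` after T timesteps, and that leftover is the quantity the SRP strategy operates on.

```python
# Minimal background sketch (PyTorch assumed); NOT the paper's Algorithm 1 (SRP).
import torch


class IFNeuron:
    """Soft-reset (reset-by-subtraction) integrate-and-fire neuron.

    After T timesteps the tensor `v` holds the residual membrane potential,
    i.e. input charge not yet emitted as spikes; this residue is what the
    paper's SRP strategy inspects.
    """

    def __init__(self, threshold: float):
        self.threshold = threshold  # firing threshold theta^l
        self.v = None               # membrane potential, lazily initialised

    def reset(self, x: torch.Tensor) -> None:
        # The experiment setup initialises v^l(0) = theta^l / 2.
        self.v = torch.full_like(x, 0.5 * self.threshold)

    def step(self, x: torch.Tensor) -> torch.Tensor:
        if self.v is None:
            self.reset(x)
        self.v = self.v + x                              # integrate input current
        spike = (self.v >= self.threshold).float()       # fire where threshold is reached
        self.v = self.v - spike * self.threshold         # soft reset: subtract, don't zero
        return spike * self.threshold                    # postsynaptic charge for next layer
```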
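
For the datasets row, the paper reports only that it applies "common data normalization and data enhancement techniques" (DeVries and Taylor 2017; Cubuk et al. 2019; Li et al. 2021a) and does not name a framework (see the Software Dependencies row). The following is a minimal data-loading sketch for CIFAR-10, assuming PyTorch/torchvision; the normalization statistics, crop padding, and the use of RandomErasing as a stand-in for Cutout are assumptions rather than details taken from the paper.

```python
# Hedged sketch (PyTorch/torchvision assumed); transform choices are assumptions.
import torchvision
import torchvision.transforms as T

cifar10_train_tf = T.Compose([
    T.RandomCrop(32, padding=4),
    T.RandomHorizontalFlip(),
    T.AutoAugment(T.AutoAugmentPolicy.CIFAR10),   # Cubuk et al. 2019
    T.ToTensor(),
    T.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
    T.RandomErasing(p=0.25),                      # stand-in for Cutout (DeVries and Taylor 2017)
])

train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=cifar10_train_tf
)
```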
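
The Experiment Setup row quotes concrete hyperparameters (SGD with momentum 0.9, cosine learning-rate decay, an initial learning rate of 0.1 with weight decay 5e-4 for CIFAR-10, and QCFS activations with L = 4) but, again, no framework. The block below is a minimal configuration sketch assuming PyTorch; the QCFS module follows the clip-floor-shift form described by Bu et al. (2022b), the epoch count is left as a parameter because the paper does not state it, and all class and function names are illustrative.

```python
# Hedged sketch (PyTorch assumed); hyperparameters quoted from the setup for CIFAR-10.
import torch
import torch.nn as nn


class QCFS(nn.Module):
    """Quantization clip-floor-shift activation used in place of ReLU (Bu et al. 2022b)."""

    def __init__(self, L: int = 4, init_lambda: float = 8.0):  # initial value is a placeholder
        super().__init__()
        self.L = L
        self.lam = nn.Parameter(torch.tensor(init_lambda))      # trainable threshold lambda^l

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # QCFS(x) = lambda * clip(floor(x * L / lambda + 1/2) / L, 0, 1).
        # In practice the floor is paired with a straight-through estimator so
        # gradients reach earlier layers; that detail is omitted here.
        return self.lam * torch.clamp(
            torch.floor(x * self.L / self.lam + 0.5) / self.L, 0.0, 1.0
        )


def build_optimizer(model: nn.Module, num_epochs: int):
    # Quoted settings: SGD, momentum 0.9, lr 0.1, weight decay 5e-4 (CIFAR-10);
    # the number of epochs is not stated in the paper, so it is left as an argument.
    optimizer = torch.optim.SGD(
        model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4
    )
    scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)
    return optimizer, scheduler
```

A typical call would be `optimizer, scheduler = build_optimizer(model, num_epochs)`, with `scheduler.step()` invoked once per epoch.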