Spike-based Neuromorphic Model for Sound Source Localization

Authors: Dehao Zhang, Shuai Wang, Ammar Belatreche, Wenjie Wei, Yichen Xiao, Haorui Zheng, Zijian Zhou, Malu Zhang, Yang Yang

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimentation demonstrates that our SSL framework achieves state-of-the-art accuracy in SSL tasks. Furthermore, it shows exceptional noise robustness and maintains high accuracy even at very low signal-to-noise ratios.
Researcher Affiliation | Academia | 1. University of Electronic Science and Technology of China; 2. Northumbria University; 3. Peking University
Pseudocode | No | The paper describes methods using mathematical equations and text, but does not provide structured pseudocode or algorithm blocks.
Open Source Code | No | Additionally, our code will be made available on subsequent after review.
Open Datasets | Yes | In this section, we evaluate our proposed spike-based SSL framework performance on three datasets: the HRTF [57], Single Words [30], and SLoClas dataset [42].
Dataset Splits | No | The paper states it evaluates performance on datasets (HRTF, Single Words, SLoClas) and describes metrics like MAE and accuracy with specific η values, but it does not explicitly provide the training, validation, and test dataset splits by percentage, sample counts, or reference to predefined splits in the text.
Hardware Specification | No | The paper discusses theoretical energy estimations using a 45nm technology assumption, but does not specify the actual hardware (e.g., GPU models, CPU types) used for running the experiments.
Software Dependencies | No | The paper does not list specific version numbers for software dependencies such as programming languages, libraries, or frameworks (e.g., Python version, PyTorch version).
Experiment Setup | Yes | Table 3: Experimental configuration of the sound localization task.

  Attributes | Setup
  1. Data preprocessing:
  Sampling rate (Hz) | 16000
  Frame length (ms) | 170
  Frame stride (ms) | 170
  RF neurons n | 512
  Number of Microphones | 4
  2. RF-PLC setting:
  CQT frequency range (Hz) | [0, 8800]
  τ (ms) | 0.0625
  Frequency channels N | 40
  Coincidence detector Nτ | 51
  Microphone pairs C | 6
  3. SNN Hyperparameter:
  α | 0.75
  Timestep | 4
  Epochs | 300
  Batch size | 128
  Optimizer | Adam
  Base learning rate | 1e-3
  Learning rate decay | Cosine
  Weight decay | 5e-3
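
Several Table 3 values are related by simple arithmetic, which the sketch below checks. It is an illustration only, not the authors' code: the class `Table3Config` and all function names are hypothetical, and the paper names neither a framework nor the exact form of its cosine decay. The schedule shown is the common half-cosine decaying to a floor of zero, which is an assumption.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Table3Config:
    """Hypothetical container for a subset of the Table 3 values."""
    sampling_rate_hz: int = 16000
    frame_length_ms: int = 170
    num_microphones: int = 4
    tau_ms: float = 0.0625
    epochs: int = 300
    base_lr: float = 1e-3


def samples_per_frame(cfg: Table3Config) -> int:
    # 170 ms at 16 kHz: sampling rate times frame length in seconds.
    return cfg.sampling_rate_hz * cfg.frame_length_ms // 1000


def microphone_pairs(cfg: Table3Config) -> list:
    # Unordered pairs of distinct microphones; 4 microphones give 6 pairs,
    # matching "Microphone pairs C 6".
    return list(combinations(range(cfg.num_microphones), 2))


def tau_in_samples(cfg: Table3Config) -> float:
    # tau expressed in sample periods of the 16 kHz signal.
    return cfg.tau_ms * cfg.sampling_rate_hz / 1000


def cosine_lr(cfg: Table3Config, epoch: int, min_lr: float = 0.0) -> float:
    # ASSUMPTION: half-cosine decay from base_lr to min_lr over all epochs.
    # The paper states only "Cosine"; the floor and period are guesses.
    progress = epoch / cfg.epochs
    return min_lr + 0.5 * (cfg.base_lr - min_lr) * (1 + math.cos(math.pi * progress))


if __name__ == "__main__":
    cfg = Table3Config()
    print(samples_per_frame(cfg), len(microphone_pairs(cfg)), tau_in_samples(cfg))
    print([cosine_lr(cfg, e) for e in (0, 150, 300)])
```

With these values, one frame spans 2720 samples, and τ = 0.0625 ms equals exactly one sample period at 16 kHz. Because the frame stride equals the frame length, consecutive frames do not overlap.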