HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors

Authors: Xiao Wang, Zongzhen Wu, Bo Jiang, Zhimin Bao, Lin Zhu, Guoqi Li, Yaowei Wang, Yonghong Tian

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate and report the performance of multiple popular HAR algorithms, which provide extensive baselines for future works to compare. More importantly, we propose a novel spatial-temporal feature learning and fusion framework, termed ESTF, for event stream based human activity recognition. ... Extensive experiments on multiple datasets fully validated the effectiveness of our model.
Researcher Affiliation | Collaboration | (1) School of Computer Science and Technology, Anhui University, Hefei 230601, China; (2) Tencent; (3) Beijing Institute of Technology; (4) University of Chinese Academy of Sciences; (5) Peng Cheng Laboratory, China; (6) National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, China; (7) School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, China.
Pseudocode | No | The paper describes the methodology using text and equations but does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Both the dataset and source code will be released at https://github.com/Event-AHU/HARDVS.
Open Datasets | Yes | Both the dataset and source code will be released at https://github.com/Event-AHU/HARDVS. ... N-Caltech101 (Orchard et al. 2015), ASL-DVS (Bi et al. 2020), and our newly proposed HARDVS.
Dataset Splits | Yes | We split 60%, 10%, and 30% of each category for training, validating, and testing, respectively. Totally, the number of videos in the training, validating, and testing subset is 64526, 10734, and 32386, respectively.
Hardware Specification | Yes | Our model spends 25 ms for each video (8 frames used) in our proposed HARDVS dataset on a server with GPU RTX-3090.
Software Dependencies | No | The paper mentions using "toolkit ptflops" but does not specify a version number for it or any other key software dependencies for their implementation.
Experiment Setup | Yes | In this work, we set T = 8 in our experiments. ... StemNet (the ResNet-18 (He et al. 2016) is selected in our experiments) ... The standard cross-entropy loss function is adopted ... Analysis on Number of Input Frames ... Analysis on Split Patches of Spatial Data ... Analysis on Layers of Transformer Layers ...
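
The Dataset Splits entry describes a per-category 60%/10%/30% split. The sketch below shows one way such a split could be produced and checks that the reported subset sizes match those proportions. The function name split_per_category, the seed, and the rounding rule are assumptions made for illustration; the summary does not state how the authors actually performed the split.

```python
import random
from collections import defaultdict


def split_per_category(samples, ratios=(0.6, 0.1, 0.3), seed=0):
    """Split (video_id, label) pairs into train/val/test within each label.

    Hypothetical helper: the rounding rule and seed are assumptions,
    not details taken from the paper.
    """
    by_label = defaultdict(list)
    for video_id, label in samples:
        by_label[label].append(video_id)

    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, val, test = [], [], []
    for label in sorted(by_label):
        ids = sorted(by_label[label])
        rng.shuffle(ids)
        n_train = round(ratios[0] * len(ids))
        n_val = round(ratios[1] * len(ids))
        train += [(v, label) for v in ids[:n_train]]
        val += [(v, label) for v in ids[n_train:n_train + n_val]]
        # Whatever remains goes to the test subset.
        test += [(v, label) for v in ids[n_train + n_val:]]
    return train, val, test


# Subset sizes as reported in the Dataset Splits entry.
reported = {"train": 64526, "val": 10734, "test": 32386}
total = sum(reported.values())
fractions = {name: count / total for name, count in reported.items()}
```

Here fractions holds each subset's share of the total, which can be compared against the stated 60%, 10%, and 30%. Small deviations are expected because the split is applied to each category separately.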