HARDVS: Revisiting Human Activity Recognition with Dynamic Vision Sensors
Authors: Xiao Wang, Zongzhen Wu, Bo Jiang, Zhimin Bao, Lin Zhu, Guoqi Li, Yaowei Wang, Yonghong Tian
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate and report the performance of multiple popular HAR algorithms, which provide extensive baselines for future works to compare. More importantly, we propose a novel spatial-temporal feature learning and fusion framework, termed ESTF, for event stream based human activity recognition. ... Extensive experiments on multiple datasets fully validated the effectiveness of our model. |
| Researcher Affiliation | Collaboration | (1) School of Computer Science and Technology, Anhui University, Hefei 230601, China; (2) Tencent; (3) Beijing Institute of Technology; (4) University of Chinese Academy of Sciences; (5) Peng Cheng Laboratory, China; (6) National Key Laboratory for Multimedia Information Processing, School of Computer Science, Peking University, China; (7) School of Electronic and Computer Engineering, Shenzhen Graduate School, Peking University, China. |
| Pseudocode | No | The paper describes the methodology using text and equations but does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Both the dataset and source code will be released at https://github.com/Event-AHU/HARDVS. |
| Open Datasets | Yes | Both the dataset and source code will be released at https://github.com/Event-AHU/HARDVS. ... N-Caltech101 (Orchard et al. 2015), ASL-DVS (Bi et al. 2020), and our newly proposed HARDVS. |
| Dataset Splits | Yes | We split 60%, 10%, and 30% of each category for training, validating, and testing, respectively. Totally, the number of videos in the training, validating, and testing subset is 64526 / 10734 / 32386, respectively. |
| Hardware Specification | Yes | Our model spends 25 ms for each video (8 frames used) in our proposed HARDVS dataset on a server with GPU RTX-3090. |
| Software Dependencies | No | The paper mentions using "toolkit ptflops" but does not specify a version number for it or any other key software dependencies for their implementation. |
| Experiment Setup | Yes | In this work, we set T = 8 in our experiments. ... StemNet (the ResNet-18 (He et al. 2016) is selected in our experiments)... The standard cross-entropy loss function is adopted... Analysis on Number of Input Frames... Analysis on Split Patches of Spatial Data... Analysis on Layers of Transformer Layers... |
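
The Dataset Splits row describes a per-category 60% / 10% / 30% split. A minimal sketch of one way to produce such a split is shown below. The function name `split_per_category`, the seeded shuffle, and the rounding rule (floor for train and validation, remainder to test) are illustrative assumptions and are not taken from the paper or its repository. The last lines only check that the reported subset sizes are consistent with the stated ratios.

```python
import random
from collections import defaultdict


def split_per_category(labels, ratios=(0.6, 0.1, 0.3), seed=0):
    """Split sample indices into train/val/test within each category.

    Assumption: the shuffle and the floor-rounding rule are illustrative
    choices; the paper only states the 60/10/30 ratios per category.
    """
    rng = random.Random(seed)

    # Group sample indices by their category label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        n_train = int(len(indices) * ratios[0])
        n_val = int(len(indices) * ratios[1])
        splits["train"] += indices[:n_train]
        splits["val"] += indices[n_train:n_train + n_val]
        # Whatever is left after train and val goes to test.
        splits["test"] += indices[n_train + n_val:]
    return splits


# Consistency check on the subset sizes reported in the table above.
reported = {"train": 64526, "val": 10734, "test": 32386}
total = sum(reported.values())
fractions = {name: count / total for name, count in reported.items()}
```

With the reported counts, the fractions come out within one percentage point of 60%, 10%, and 30%, which is what a per-category split with integer rounding would be expected to give.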