High Performance Gesture Recognition via Effective and Efficient Temporal Modeling
Authors: Yang Yi, Feng Ni, Yuexin Ma, Xinge Zhu, Yuankai Qi, Riming Qiu, Shijie Zhao, Feng Li, Yongtao Wang
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments conducted on public datasets demonstrate that our proposed model achieves the state-of-the-art with higher efficiency. Moreover, the proposed MKTB and GRB are plug-and-play modules, and the experiments on other tasks, like video understanding and video-based person re-identification, also display their good performance in efficiency and capability of generalization. |
| Researcher Affiliation | Collaboration | Yang Yi (1), Feng Ni (2), Yuexin Ma (3), Xinge Zhu (4), Yuankai Qi (5), Riming Qiu (1), Shijie Zhao (1), Feng Li (1) and Yongtao Wang (2); (1) Media Lab, Tencent; (2) Peking University; (3) University of Hong Kong; (4) The Chinese University of Hong Kong; (5) Harbin Institute of Technology, Weihai, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code will be available at https://github.com/nemonameless/Gesture-Recognition. |
| Open Datasets | Yes | IsoGD [Wan et al., 2016] is a large-scale multi-modality gesture dataset which contains 249 gesture classes. Jester [TwentyBN, 2017] is a large collection of densely-labeled video clips of hand gestures. Something-Something-V1 [Goyal et al., 2017] is a challenging dataset that shows basic actions with everyday objects. MARS [Zheng et al., 2016] is the largest video-based person re-identification dataset. |
| Dataset Splits | Yes | This dataset is split into three subsets: 35,878 videos for training, 5,784 videos for validation and 6,271 videos for testing. |
| Hardware Specification | Yes | The proposed networks are trained with the PyTorch deep learning framework on NVIDIA Tesla P40 GPUs with CUDA 8.0. |
| Software Dependencies | No | The paper mentions the "PyTorch deep learning framework" and "CUDA 8.0" but does not specify the version number for PyTorch itself, which is a key software dependency. |
| Experiment Setup | Yes | Unless otherwise noted, we set temporal segments T = 8. Following the data augmentation strategies of TSN [Wang et al., 2016], the frames are cropped and resized to 224×224 after aspect ratio jittering and scale jittering. For all experiments, we adopt mini-batch SGD to optimize the model with momentum of 0.9 and weight decay of 5e-4. We train for 60 epochs with cross entropy loss and batch size of 48. The learning rate is initialized as 0.01 and reduced by a factor of 10 every 20 epochs. A dropout layer with ratio of 0.5 is added before the classification layer. |
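
The Experiment Setup row maps onto a standard PyTorch training configuration. The sketch below is a minimal illustration of the reported hyperparameters (SGD with momentum 0.9, weight decay 5e-4, initial learning rate 0.01 decayed by 10x every 20 epochs, dropout 0.5 before the classifier, 60 epochs, batch size 48, cross entropy loss, 224×224 inputs). It is not taken from the authors' released code: the simplified torchvision augmentation and the `attach_classifier` helper are assumptions standing in for the paper's TSN-style pipeline and backbone.

```python
import torch
import torch.nn as nn
from torchvision import transforms

# Hyperparameters as reported in the paper's experiment setup.
NUM_SEGMENTS = 8        # temporal segments T
BATCH_SIZE = 48
NUM_EPOCHS = 60
BASE_LR = 0.01
MOMENTUM = 0.9
WEIGHT_DECAY = 5e-4
DROPOUT_RATIO = 0.5

# Per-frame spatial augmentation resized to 224x224. The TSN codebase's
# GroupMultiScaleCrop (aspect ratio + scale jittering) is approximated here
# by torchvision's RandomResizedCrop; this is a simplified stand-in.
train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224, scale=(0.66, 1.0)),
    transforms.ToTensor(),
])

def attach_classifier(backbone: nn.Module, feat_dim: int, num_classes: int) -> nn.Module:
    """Hypothetical helper: place dropout(0.5) before the classification layer
    of a ResNet-style backbone that exposes a `.fc` attribute."""
    backbone.fc = nn.Sequential(
        nn.Dropout(p=DROPOUT_RATIO),
        nn.Linear(feat_dim, num_classes),
    )
    return backbone

def make_optimizer(model: nn.Module):
    """Mini-batch SGD with momentum and weight decay, LR reduced 10x every 20 epochs."""
    optimizer = torch.optim.SGD(
        model.parameters(),
        lr=BASE_LR,
        momentum=MOMENTUM,
        weight_decay=WEIGHT_DECAY,
    )
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.1)
    return optimizer, scheduler

# Cross entropy loss, as stated in the paper.
criterion = nn.CrossEntropyLoss()
```

The sketch omits the paper's actual backbone and the proposed MKTB/GRB modules; it only reproduces the optimization and augmentation settings quoted in the table above.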