Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Multimodal Gesture Recognition via Multiple Hypotheses Rescoring
Authors: Vassilis Pitsikalis, Athanasios Katsamanis, Stavros Theodorakis, Petros Maragos
JMLR 2015 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The overall approach achieves 93.3% gesture recognition accuracy in the ChaLearn Kinect-based multimodal data set, significantly outperforming all recently published approaches on the same challenging multimodal gesture recognition task, providing a relative error rate reduction of at least 47.6%. |
| Researcher Affiliation | Academia | Vassilis Pitsikalis (EMAIL), Athanasios Katsamanis (EMAIL), Stavros Theodorakis (EMAIL), Petros Maragos (EMAIL); National Technical University of Athens, School of Electrical and Computer Engineering, Zografou Campus, Athens 15773, Greece |
| Pseudocode | Yes | Algorithm 1: Multimodal Scoring and Resorting of Hypotheses; Algorithm 2: Segmental Parallel Fusion |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | Yes | For our experiments we employ the ChaLearn multimodal gesture challenge data set, introduced by Escalera et al. (2013b). |
| Dataset Splits | Yes | The data set contains three separate sets, namely for development, validation and final evaluation, including 39 users and 13858 gesture-word instances in total. |
| Hardware Specification | Yes | For the measurements we employed an AMD Opteron(tm) Processor 6386 at 2.80GHz with 32GB RAM. |
| Software Dependencies | No | The paper mentions algorithms and methods such as Hidden Markov Models (HMMs), the Baum-Welch algorithm, the Viterbi algorithm, and Gaussian mixture models (GMMs), and cites 'The HTK Book', but it does not specify any software packages or libraries with version numbers used for the implementation. |
| Experiment Setup | Yes | For skeleton, we train left-right HMMs with 12 states and 2 Gaussians per state. For handshape, the models correspondingly have 8 states and 3 Gaussians per state, while speech gesture models have 22 states and 10 Gaussians per state. ... N is chosen to be equal to 200. ... The best weight combination for the multimodal hypothesis rescoring component is found to be w_{SK,HS,AU} = [63.6, 9.1, 27.3] ... the best combination of weights for the segmental fusion component is [0.6, 0.6, 98.8]. |
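As a concrete reading of the accuracy figure quoted in the Research Type row: 93.3% accuracy corresponds to a 6.7% error rate, and a relative error rate reduction of at least 47.6% then implies an error rate for the best competing approach. The sketch below infers that baseline figure from the two reported numbers; the baseline value itself is not stated in this report.

```python
# Relative error rate reduction: RERR = (e_base - e_new) / e_base.
e_new = 1.0 - 0.933          # error rate at 93.3% accuracy -> 0.067
rerr = 0.476                 # reported relative error rate reduction

# Solving RERR = 1 - e_new / e_base for the implied baseline error rate:
e_base = e_new / (1.0 - rerr)
print(f"implied competing error rate: {e_base:.1%}")  # about 12.8%
```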
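The weighted multimodal hypothesis rescoring quoted in the Experiment Setup row can be illustrated with a minimal sketch. The function and score values below are hypothetical (invented for illustration, not taken from the paper); only the skeleton/handshape/audio weights [63.6, 9.1, 27.3] come from the reported setup. Each hypothesis is scored per modality, fused by a weighted sum, and the list is re-sorted best-first:

```python
# Hypothetical sketch of weighted late fusion for hypothesis rescoring.
# Weights for skeleton (SK), handshape (HS), and audio (AU) modalities,
# as reported in the paper's experiment setup.
WEIGHTS = {"SK": 63.6, "HS": 9.1, "AU": 27.3}

def rescore(hypotheses):
    """Re-sort hypotheses by the weighted sum of per-modality scores.

    Each hypothesis is (label, {"SK": score, "HS": score, "AU": score}).
    Returns the list sorted best-first by the fused score.
    """
    total = sum(WEIGHTS.values())
    def fused(item):
        _, scores = item
        return sum(WEIGHTS[m] * scores[m] for m in WEIGHTS) / total
    return sorted(hypotheses, key=fused, reverse=True)

# Invented example scores: audio strongly favours "wave",
# skeleton mildly favours "point".
hyps = [
    ("point", {"SK": 0.6, "HS": 0.5, "AU": 0.2}),
    ("wave",  {"SK": 0.5, "HS": 0.5, "AU": 0.9}),
]
best_label, _ = rescore(hyps)[0]
print(best_label)  # "wave": the audio margin outweighs the skeleton margin
```

The dominant skeleton weight reflects the paper's finding that skeleton is the strongest single modality, but a large enough audio margin can still flip the ranking, which is the point of fusing before re-sorting.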