Video Covariance Matrix Logarithm for Human Action Recognition in Videos

Authors: Piotr Bilinski, Francois Bremond

IJCAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Then, we present an extensive evaluation of the proposed VCML descriptor with the Fisher vector encoding and the Support Vector Machines on four challenging action recognition datasets. We show that the VCML descriptor achieves better results than the state-of-the-art appearance descriptors.
Researcher Affiliation | Academia | Piotr Bilinski and Francois Bremond, INRIA Sophia Antipolis, STARS team, 2004 Route des Lucioles, BP93, 06902 Sophia Antipolis, France, {Piotr.Bilinski,Francois.Bremond}@inria.fr
Pseudocode | No | The paper includes figures describing processes (e.g., Figure 1: Overview of the video frame descriptor calculation) and mathematical equations, but no explicitly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | No | The paper does not include any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We present an evaluation, comparison and analysis of the proposed VCML descriptor and action recognition approach on 4 state-of-the-art action recognition datasets: URADL, MSR Daily Activity 3D, UCF50 and HMDB51. In all the experiments we follow the recommended evaluation protocols provided by the authors of the dataset.
Dataset Splits | Yes | We use the leave-one-person-out cross-validation evaluation scheme to report the performance. The UCF50 dataset [...] Videos are divided into 25 folds and we follow the recommended 25-folds cross-validation to report the performance. The HMDB51 dataset [...] We use 3 train-test splits provided by the authors of this dataset and we report average accuracy over the 3 splits. We set the number of Gaussians (i.e. the codebook size) using the leave-one-person-out cross-validation for the URADL and the MSR datasets, leave-one-fold-out cross-validation for the UCF50, and 5-folds cross-validation for the HMDB51.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions software components like 'linear Support Vector Machines (SVMs)' and cites 'LIBSVM: A library for support vector machines', but it does not specify version numbers for any software or libraries used in the experiments.
Experiment Setup | Yes | To estimate the GMM parameters for the Fisher vector encoding, we randomly sample a subset of 100k features from the training set. We consider 6 various codebook sizes $K = \{2^i\}_{i=4}^{9}$ for the URADL and the MSR datasets and 4 codebook sizes $K = \{2^i\}_{i=4}^{7}$ for the UCF50 and the HMDB51 datasets. To increase precision, we initialize the GMM ten times and we keep the codebook with the lowest error.
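
Since the paper provides no pseudocode or source code, the selection procedure quoted above can only be sketched under assumptions. The Python snippet below enumerates the two codebook-size grids, builds leave-one-person-out folds from per-video person labels, and keeps the lowest-error result over ten restarts. The names `codebook_sizes`, `leave_one_group_out`, `best_of_restarts` and the `fit` callback are hypothetical helpers introduced for illustration; the paper does not say how the GMM error is measured, so `fit` simply returns an `(error, model)` pair.

```python
import random


def codebook_sizes(lo, hi):
    """Return the grid K = {2^i} for i = lo, ..., hi (inclusive)."""
    return [2 ** i for i in range(lo, hi + 1)]


def leave_one_group_out(groups):
    """Yield (held_out, train_indices, test_indices) per distinct group label.

    With person identifiers as labels this is a leave-one-person-out split;
    with fold identifiers it is a leave-one-fold-out split.
    """
    for held_out in sorted(set(groups)):
        train = [i for i, g in enumerate(groups) if g != held_out]
        test = [i for i, g in enumerate(groups) if g == held_out]
        yield held_out, train, test


def best_of_restarts(fit, n_restarts=10, seed=0):
    """Call fit(rng) n_restarts times and keep the lowest-error result.

    `fit` is a hypothetical callback standing in for one randomly
    initialized GMM estimation; it must return (error, model).
    """
    rng = random.Random(seed)
    best = None
    for _ in range(n_restarts):
        error, model = fit(rng)
        if best is None or error < best[0]:
            best = (error, model)
    return best


# Grids quoted in the Experiment Setup row.
SIZES_URADL_MSR = codebook_sizes(4, 9)     # 6 sizes
SIZES_UCF50_HMDB51 = codebook_sizes(4, 7)  # 4 sizes
```

In a real pipeline the `fit` callback would wrap a GMM estimator run on the 100k sampled features, and the outer loop would evaluate each codebook size on the folds produced by `leave_one_group_out`.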