Uncertainty Sampling for Action Recognition via Maximizing Expected Average Precision

Authors: Hanmo Wang, Xiaojun Chang, Lei Shi, Yi Yang, Yi-Dong Shen

Venue: IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on three real-world action recognition datasets and show that our algorithm outperforms other uncertainty-based active learning algorithms.
Researcher Affiliation | Academia | (1) State Key Lab. of Computer Science, Institute of Software, Chinese Academy of Sciences, China; (2) University of Chinese Academy of Sciences, Beijing 100049, China; (3) School of Computer Science, Carnegie Mellon University, Pittsburgh, USA; (4) Centre for Artificial Intelligence, University of Technology Sydney, Sydney, Australia
Pseudocode | Yes | Algorithm 1 USAP (reconstructed; a runnable sketch follows the table):
Input: number of selected videos k, number of classes c, number of unlabeled videos n, probability estimate p ∈ [0, 1]^(n×c), unlabeled video set U
Output: selected video set S
1: φ ← 0
2: for i = 1 to c do
3:   for j = 1 to n do
4:     p′ ← p_·i \ {p_ji}  % the i-th column of p without p_ji
5:     Sort p′ in descending order
6:     Calculate g(·, ·) using Eq. (17) and p′
7:     Calculate f(·, ·) using Eq. (18) and p′
8:     Calculate APV using Eq. (14)
9:     φ_j ← φ_j + APV
10:  end for
11: end for
12: Select S ⊆ U corresponding to the k largest φ values
13: Return: S
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | The HMDB51 dataset [Kuehne et al., 2011] has 51 action classes and 6766 video clips extracted from digitized movies and YouTube. The Hollywood2 dataset [Marszalek et al., 2009] contains 12 action classes and 1707 video clips collected from 69 different Hollywood movies. The UCF50 dataset [Reddy and Shah, 2013] has 50 action classes spanning 6618 YouTube video clips that can be split into 25 groups.
Dataset Splits | No | The paper mentions an 'official training set', an 'official testing set', and an initial 'labeled dataset', and refers to 'train-test splits' and a 'pre-defined split of 823 training videos and 884 test videos', but it does not explicitly define a separate validation split.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions 'Logistic Regression' and a 'level-three MIFS feature extractor' but does not specify any software names with version numbers for reproducibility.
Experiment Setup | Yes | For each video clip, a level-three MIFS [Lan et al., 2015] feature extractor is used to extract fixed-length features, and Logistic Regression with parameter C = 100 is used as the underlying linear classifier to conduct one-vs-rest classification. ... we randomly select 10 videos of each class as the initial labeled dataset. ... We iteratively select c videos until the labeling budget is reached. ... we use the fast screen rule in Section 3.4 to filter out uninformative video samples so that the size of the unlabeled pool does not exceed 2000. (A classifier sketch follows the table.)
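
To make the control flow of Algorithm 1 concrete, here is a minimal NumPy sketch of the USAP scoring loop. The closed forms of Eqs. (14), (17), and (18) are not reproduced in this summary, so `eq17_g`, `eq18_f`, and `eq14_apv` below are hypothetical placeholders for the paper's formulas; only the surrounding selection logic follows the pseudocode.

```python
import numpy as np

def eq17_g(p_sorted):
    """Hypothetical stand-in for Eq. (17); the closed form is in the paper."""
    raise NotImplementedError

def eq18_f(p_sorted):
    """Hypothetical stand-in for Eq. (18); the closed form is in the paper."""
    raise NotImplementedError

def eq14_apv(p_ji, g, f):
    """Hypothetical stand-in for Eq. (14), the expected-AP contribution."""
    raise NotImplementedError

def usap_select(p, U, k):
    """Score each unlabeled video by its expected-AP contribution summed
    over classes (Algorithm 1) and return the k highest-scoring videos.

    p -- (n, c) matrix of class-probability estimates in [0, 1]
    U -- sequence of n unlabeled video identifiers
    k -- number of videos to select
    """
    n, c = p.shape
    phi = np.zeros(n)                      # line 1: phi <- 0
    for i in range(c):                     # line 2: loop over classes
        for j in range(n):                 # line 3: loop over videos
            p_col = np.delete(p[:, i], j)  # line 4: column i without p_ji
            p_col = np.sort(p_col)[::-1]   # line 5: sort descending
            g = eq17_g(p_col)              # line 6: Eq. (17)
            f = eq18_f(p_col)              # line 7: Eq. (18)
            apv = eq14_apv(p[j, i], g, f)  # line 8: Eq. (14)
            phi[j] += apv                  # line 9: accumulate score
    top = np.argsort(phi)[::-1][:k]        # line 12: k largest phi values
    return [U[t] for t in top]             # line 13
```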
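
The classifier side of the experiment setup is straightforward to approximate with scikit-learn. Below is a minimal sketch, assuming scikit-learn's LogisticRegression plays the role of the paper's 'Logistic Regression with parameter C = 100'; the MIFS feature extractor is not reimplemented, so the arrays and the feature dimensionality are placeholder data, and `max_iter` is a convergence convenience not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

n_classes, feat_dim = 51, 2000  # HMDB51-like class count; feature dim is illustrative
rng = np.random.default_rng(0)

# Stand-ins for MIFS features: 10 initially labeled videos per class, and
# an unlabeled pool capped at 2000 by the paper's fast screen rule.
X_labeled = rng.random((10 * n_classes, feat_dim))
y_labeled = np.repeat(np.arange(n_classes), 10)
X_pool = rng.random((2000, feat_dim))

# One-vs-rest Logistic Regression with C = 100, as in the paper's setup.
clf = OneVsRestClassifier(LogisticRegression(C=100, max_iter=1000))
clf.fit(X_labeled, y_labeled)

# Class-probability estimates p in [0, 1]^(n x c): the input to Algorithm 1.
p = clf.predict_proba(X_pool)
```

From here, one active-learning iteration would call `usap_select(p, list(range(len(X_pool))), n_classes)` from the sketch above to pick c videos for labeling, matching 'we iteratively select c videos until the labeling budget is reached.'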