Brute-Force Facial Landmark Analysis With a 140,000-Way Classifier

Authors: Mengtian Li, Laszlo Jeni, Deva Ramanan

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We perform a comprehensive experimental analysis of our method on standard benchmarks, demonstrating state-of-the-art results for facial alignment in videos.
Researcher Affiliation | Academia | Mengtian Li, Laszlo Jeni, Deva Ramanan The Robotics Institute, Carnegie Mellon University {mtli, laszlojeni, deva}@cmu.edu
Pseudocode | No | The paper describes algorithms but does not contain structured pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The results on the entire video can be found on the author's website. Two demo videos can be found on the author's website showing our interactive annotation in progress.
Open Datasets | Yes | We test our algorithms on the 300VW dataset (Shen et al. 2015), a standard benchmark for video face alignment. ... we include into our training set images from 300W (Sagonas et al. 2013), IBUG, HELEN, LFPW, AFW. Moreover, we include synthetic large pose dataset 300WLP (Zhu et al. 2016). ... The detector is trained on the CelebA (Liu et al. 2015) and the WIDER FACE datasets (Yang et al. 2016).
Dataset Splits | Yes | To form the validation set, we randomly pick 10% of the training videos. For the remaining 59 training videos, we subsample 10% of the frame at uniform interval to remove data correlation. This forms our base training set.
Hardware Specification | No | Our work uses a single GPU in MATLAB and handles 40% more classes. ... the 12GB memory available on current graphics cards.
Software Dependencies | No | Our MATLAB code takes around 60ms per frame, including detection refinement and postprocessing regressors.
Experiment Setup | Yes | For this experiment, we use the exemplar-class, since we find the more classes, the lower error the model will predict. The membership set threshold τ is determined through validation. ... For the post-processing fine-tuning, we train 100 pose class regressors with 7 levels of cascades. For temporal smoothing, we use a low pass filter on 3 consecutive frames.
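
The Dataset Splits and Experiment Setup entries describe three small data-handling steps: holding out 10% of the training videos for validation, keeping 10% of the frames at a uniform interval, and low-pass filtering landmarks over 3 consecutive frames. The Python sketch below illustrates one way these steps might look. It is not the authors' MATLAB code. The function names, the random seed, the rounding rule for the hold-out count, and the choice of a uniform 3-frame moving average as the low-pass filter are all assumptions, since the quoted text does not specify them.

```python
import random


def split_videos(video_ids, val_fraction=0.10, seed=0):
    """Randomly hold out a fraction of the videos for validation.

    The seed and the rounding rule are assumptions for illustration.
    """
    rng = random.Random(seed)
    shuffled = list(video_ids)
    rng.shuffle(shuffled)
    n_val = max(1, round(val_fraction * len(shuffled)))
    return shuffled[n_val:], shuffled[:n_val]  # (train, validation)


def subsample_frames(num_frames, keep_fraction=0.10):
    """Return frame indices taken at a uniform interval."""
    step = max(1, round(1 / keep_fraction))
    return list(range(0, num_frames, step))


def smooth_landmarks(track):
    """Apply a 3-frame moving average to a landmark track.

    `track` is a list of frames; each frame is a list of (x, y) tuples.
    The first and last frames average over the two frames available.
    A uniform kernel is assumed; the source only says "low pass filter".
    """
    smoothed = []
    for t in range(len(track)):
        window = track[max(0, t - 1): t + 2]
        frame = []
        for k in range(len(track[t])):
            mean_x = sum(f[k][0] for f in window) / len(window)
            mean_y = sum(f[k][1] for f in window) / len(window)
            frame.append((mean_x, mean_y))
        smoothed.append(frame)
    return smoothed


if __name__ == "__main__":
    # Hypothetical example: 20 video ids, a 100-frame video, one landmark.
    train, val = split_videos(range(20))
    print(len(train), len(val))
    print(subsample_frames(100))
    print(smooth_landmarks([[(0.0, 0.0)], [(3.0, 3.0)], [(6.0, 6.0)]]))
```

Subsampling per video after the split, rather than before, keeps validation videos fully separate from the training frames, which matches the order in which the quoted text describes the two steps.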