Predicting Emotions in User-Generated Videos

Authors: Yu-Gang Jiang, Baohan Xu, Xiangyang Xue

AAAI 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Results of a comprehensive set of experiments indicate that combining multiple types of features, such as the joint use of audio and visual clues, is important, and attribute features, such as those containing sentiment-level semantics, are very effective. We now introduce experimental settings and discuss the results. In addition to using the entire dataset of eight emotion categories, we also discuss results on a subset of four emotions (Anger, Fear, Joy, and Sadness), which have been more frequently adopted in existing works. For both the entire set and the subset, we randomly generate ten train-test splits, each using 2/3 of the data for training and 1/3 for testing. A model is trained on each split for each emotion, and we report the mean and standard deviation of the ten prediction accuracies, which are measured as the proportions of test samples with correctly assigned emotion labels. (A minimal sketch of this evaluation protocol appears after the table.)
Researcher Affiliation | Academia | Yu-Gang Jiang, Baohan Xu, Xiangyang Xue; School of Computer Science, Fudan University, Shanghai, China; Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, China
Pseudocode | No | The paper describes the system pipeline but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper states that the dataset is publicly available but does not provide the source code for the methodology described.
Open Datasets | Yes | To prompt research on this interesting and important problem, we first construct and publicly release a benchmark dataset [1] based on videos downloaded from YouTube and Flickr (see Figure 1 for several example frames). [1] Available at www.yugangjiang.info/research/VideoEmotions/.
Dataset Splits | No | The paper mentions 'train-test splits' but does not specify a separate 'validation' split for hyperparameter tuning or model selection.
Hardware Specification | No | The paper does not report the hardware used to run its experiments.
Software Dependencies | No | The paper mentions SVMs and kernel types but does not name specific software with version numbers (e.g., a Python version or library versions).
Experiment Setup | Yes | We adopt the popular SVM due to its outstanding performance in many visual recognition tasks. For the kernel option of the SVM, we adopt the χ² RBF kernel for all the bag-of-words representations, because it is particularly suitable for histogram-like features. The standard Gaussian RBF kernel is used for the remaining features. We follow the one-against-all strategy to train a separate classifier for each category, and a test sample is assigned to the category with the highest prediction score. Equal fusion weights are used in our experiments for simplicity. (A hedged code sketch of this setup follows the table.)
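The evaluation protocol quoted under Research Type (ten random 2/3 train / 1/3 test splits, reporting the mean and standard deviation of accuracy) can be summarized in code. Below is a minimal sketch, assuming `features` and `labels` are NumPy arrays holding extracted video features and the eight emotion labels; scikit-learn is an assumption of this sketch rather than the authors' toolchain, and the multiclass SVC stands in for the paper's per-emotion one-against-all classifiers, which are detailed in the setup sketch further down.

```python
import numpy as np
from sklearn.model_selection import ShuffleSplit
from sklearn.svm import SVC

def evaluate(features, labels, n_splits=10, seed=0):
    """Ten random splits, 2/3 train / 1/3 test; return mean and std accuracy."""
    splitter = ShuffleSplit(n_splits=n_splits, test_size=1 / 3, random_state=seed)
    accuracies = []
    for train_idx, test_idx in splitter.split(features):
        clf = SVC(kernel="rbf")  # stand-in classifier; kernels per feature type below
        clf.fit(features[train_idx], labels[train_idx])
        predictions = clf.predict(features[test_idx])
        # Accuracy = proportion of test samples with the correct emotion label.
        accuracies.append(np.mean(predictions == labels[test_idx]))
    return np.mean(accuracies), np.std(accuracies)
```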
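The Experiment Setup row describes a χ² RBF kernel for bag-of-words features, a Gaussian RBF kernel for the remaining features, one-against-all training, and equal-weight fusion of prediction scores. A minimal sketch follows, using scikit-learn's `chi2_kernel` and `rbf_kernel` with precomputed Gram matrices; the feature matrices here are randomly generated placeholders, and fusing exactly two feature types is an illustrative assumption (the paper combines several).

```python
import numpy as np
from sklearn.metrics.pairwise import chi2_kernel, rbf_kernel
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

def train_and_score(X_train, X_test, y_train, kernel_fn):
    # Precompute the Gram matrices so the SVM uses the chosen kernel directly.
    K_train = kernel_fn(X_train, X_train)
    K_test = kernel_fn(X_test, X_train)
    clf = OneVsRestClassifier(SVC(kernel="precomputed"))  # one-against-all
    clf.fit(K_train, y_train)
    return clf.decision_function(K_test)  # one score column per emotion category

# Hypothetical stand-ins for the paper's features: nonnegative BoW histograms
# (the chi2 kernel requires nonnegative inputs) and dense real-valued features.
rng = np.random.default_rng(0)
bow_train, bow_test = rng.random((40, 100)), rng.random((20, 100))
dense_train, dense_test = rng.standard_normal((40, 64)), rng.standard_normal((20, 64))
y_train = rng.integers(0, 8, 40)  # eight emotion categories

scores_bow = train_and_score(bow_train, bow_test, y_train, chi2_kernel)
scores_dense = train_and_score(dense_train, dense_test, y_train, rbf_kernel)

# Equal fusion weights: average the scores, then assign each test sample to
# the category with the highest fused prediction score.
fused = (scores_bow + scores_dense) / 2
predictions = fused.argmax(axis=1)
```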