Dependency Exploitation: A Unified CNN-RNN Approach for Visual Emotion Recognition

Authors: Xinge Zhu, Liang Li, Weigang Zhang, Tianrong Rao, Min Xu, Qingming Huang, Dong Xu

IJCAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments on both Internet images and art photo datasets demonstrate that our method outperforms the state-of-the-art methods with at least 7% performance improvement."
Researcher Affiliation | Academia | University of Chinese Academy of Sciences, China; Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, China; Harbin Institute of Technology, Weihai, China; University of Technology Sydney, Australia; University of Sydney, Australia
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our model and results are available online" (footnote 1: https://github.com/WERush/Unified_CNN_RNN)
Open Datasets | Yes | "The large scale emotion dataset is recently published in [You et al., 2016]... We use the labeled dataset and the same training/testing split as in [Rao et al., 2016b] to evaluate these methods. ... The Art Photo dataset [Machajdik and Hanbury, 2010]... In [Mikels et al., 2005], 395 images are collected from the standard IAPS dataset and labeled with arousal and valence values, which formed the IAPS-Subset dataset."
Dataset Splits | Yes | "Specifically, the dataset is randomly split into a training set (80%, 18,532 images), a testing set (15%, 3,474 images) and a validation set (5%, 1,158 images)." (See the split sketch after this table.)
Hardware Specification | Yes | "Our model is implemented by using Torch7 [Collobert et al., 2011] on one Nvidia GTX Titan X."
Software Dependencies | No | The paper mentions "Torch7" but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | "We set λ = 0.5 in Eq. (11) to balance the loss function and the regularization term, and set margin µ = 1. The batch size is set to 64; the CNN part is optimized using SGD with a learning rate of 0.001, and the Bi-GRU is optimized using RMSprop [Tieleman and Hinton, 2012] with a learning rate of 0.0001. In addition, a staircase weight decay is applied after 10 epochs." (See the configuration sketch after this table.)
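
The 80%/15%/5% split reported in the "Dataset Splits" row can be reproduced with a plain random shuffle. The following is a minimal sketch, not the authors' released code; the function name, random seed, and file-list input are assumptions made for illustration.

    # Hedged sketch of the 80/15/5 random split from the "Dataset Splits" row.
    # The seed and the image_paths input are assumptions, not from the paper.
    import random

    def split_dataset(image_paths, train_frac=0.80, test_frac=0.15, seed=42):
        """Randomly split a list of image paths into train/test/val subsets."""
        paths = list(image_paths)
        random.Random(seed).shuffle(paths)
        n = len(paths)
        n_train = int(n * train_frac)
        n_test = int(n * test_frac)
        train = paths[:n_train]
        test = paths[n_train:n_train + n_test]
        val = paths[n_train + n_test:]   # remaining ~5% used for validation
        return train, test, val

With the 23,164 labeled images implied by the reported counts, this scheme yields roughly 18,531 / 3,474 / 1,159 images, close to the paper's 18,532 / 3,474 / 1,158; small off-by-one differences come from rounding.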
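The hyperparameters in the "Experiment Setup" row describe two separately optimized sub-networks. Below is a minimal configuration sketch written in PyTorch rather than the paper's Torch7; the placeholder modules (cnn, bi_gru) and the decay factor are assumptions, and the "staircase weight decay" is interpreted here as a step-wise learning-rate schedule.

    # Hedged sketch of the reported training configuration (PyTorch, not Torch7).
    import torch

    lambda_reg = 0.5   # balances the loss function and the regularization term (Eq. 11)
    margin_mu = 1.0    # margin used in the regularization term
    batch_size = 64

    # Placeholder modules standing in for the paper's CNN and bidirectional GRU parts.
    cnn = torch.nn.Sequential(torch.nn.Conv2d(3, 16, 3), torch.nn.ReLU())
    bi_gru = torch.nn.GRU(input_size=16, hidden_size=16, bidirectional=True)

    # Two optimizers, as reported: SGD (lr = 0.001) for the CNN part,
    # RMSprop (lr = 0.0001) for the Bi-GRU part.
    opt_cnn = torch.optim.SGD(cnn.parameters(), lr=1e-3)
    opt_gru = torch.optim.RMSprop(bi_gru.parameters(), lr=1e-4)

    # Staircase decay applied after 10 epochs; the factor 0.1 is an assumption,
    # since the paper does not state the decay rate.
    sched_cnn = torch.optim.lr_scheduler.StepLR(opt_cnn, step_size=10, gamma=0.1)
    sched_gru = torch.optim.lr_scheduler.StepLR(opt_gru, step_size=10, gamma=0.1)

In a training loop, each optimizer would step on its own parameter group after a shared backward pass, and both schedulers would step once per epoch; the exact loop structure is not specified in the paper.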