Dependency Exploitation: A Unified CNN-RNN Approach for Visual Emotion Recognition
Authors: Xinge Zhu, Liang Li, Weigang Zhang, Tianrong Rao, Min Xu, Qingming Huang, Dong Xu
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on both Internet images and art photo datasets demonstrate that our method outperforms the state-of-the-art methods with at least 7% performance improvement. |
| Researcher Affiliation | Academia | University of Chinese Academy of Sciences, China; Key Lab of Intell. Info. Process., Inst. of Comput. Tech., CAS, China; Harbin Institute of Technology, Weihai, China; University of Technology, Sydney, Australia; University of Sydney, Australia |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our model and results are available online: https://github.com/WERush/Unified_CNN_RNN |
| Open Datasets | Yes | The large scale emotion dataset is recently published in [You et al., 2016]... We use the labeled dataset and the same training/testing split as in [Rao et al., 2016b] to evaluate these methods. ... The Art Photo dataset [Machajdik and Hanbury, 2010]... In [Mikels et al., 2005], 395 images are collected from the standard IAPS dataset and labeled with arousal and valence values, which formed the IAPS-Subset dataset. |
| Dataset Splits | Yes | Specifically, the dataset is randomly split into a training set (80%, 18,532 images), a testing set (15%, 3,474 images) and a validation set (5%, 1,158 images). |
| Hardware Specification | Yes | Our model is implemented by using Torch7 [Collobert et al., 2011] on one Nvidia GTX Titan X. |
| Software Dependencies | No | The paper mentions "Torch7" but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | We set λ = 0.5 in Eq. (11) to balance the loss function and the regularization term, and set the margin µ = 1. The batch size is set to 64; the CNN part is optimized using SGD with a learning rate of 0.001, and the Bi-GRU is optimized using RMSprop [Tieleman and Hinton, 2012] with a learning rate of 0.0001. In addition, a staircase weight decay is applied after 10 epochs. |
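
The 80/15/5 split in the Dataset Splits row implies a pool of 23,164 labeled images (18,532 + 3,474 + 1,158). As a minimal sketch, the Python snippet below reproduces such a random split; the function name, seed, and rounding behavior are illustrative assumptions, not the authors' code, so counts may differ by an image or two from the reported sizes.

```python
import random

def split_dataset(image_ids, seed=0):
    """Randomly split image ids 80/15/5 into train/test/validation sets.

    A minimal sketch of the split described in the paper; the seed and
    floor-based rounding are assumptions, so the resulting counts may
    differ by +/-1 from the reported 18,532/3,474/1,158.
    """
    rng = random.Random(seed)
    ids = list(image_ids)
    rng.shuffle(ids)
    n = len(ids)
    n_train = int(0.80 * n)                 # ~18,531 for n = 23,164
    n_test = int(0.15 * n)                  # ~3,474
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    val = ids[n_train + n_test:]            # remaining ~5% (~1,158)
    return train, test, val
```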
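The Experiment Setup row is concrete enough to re-express in code. The authors' implementation is in Torch7/Lua, so the PyTorch sketch below is an assumption-laden translation: the stand-in modules, the step-schedule reading of "staircase weight decay", and the 0.1 decay factor are hypothetical; only the batch size, optimizers, learning rates, λ = 0.5, and margin µ = 1 come from the paper.

```python
import torch
import torch.nn as nn

# Stand-in modules for the paper's CNN and Bi-GRU branches (hypothetical;
# the original Torch7 model is at https://github.com/WERush/Unified_CNN_RNN).
cnn = nn.Conv2d(3, 64, kernel_size=3)
bi_gru = nn.GRU(input_size=64, hidden_size=32, bidirectional=True)

batch_size = 64         # reported batch size
lam, margin = 0.5, 1.0  # lambda in Eq. (11) and margin mu, as reported

# CNN optimized with SGD (lr 0.001); Bi-GRU with RMSprop (lr 0.0001).
opt_cnn = torch.optim.SGD(cnn.parameters(), lr=1e-3)
opt_gru = torch.optim.RMSprop(bi_gru.parameters(), lr=1e-4)

# "Staircase weight decay after 10 epochs" interpreted here as a stepped
# learning-rate schedule; the decay factor 0.1 is an assumption.
sched_cnn = torch.optim.lr_scheduler.StepLR(opt_cnn, step_size=10, gamma=0.1)
sched_gru = torch.optim.lr_scheduler.StepLR(opt_gru, step_size=10, gamma=0.1)

def total_loss(task_loss, reg_loss):
    # Shape of Eq. (11) as described: the task loss balanced against the
    # regularization term by lambda = 0.5.
    return task_loss + lam * reg_loss
```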