CausVSR: Causality Inspired Visual Sentiment Recognition
Authors: Xinyue Zhang, Zhaoxia Wang, Hailing Wang, Jing Xiang, Chunwei Wu, Guitao Cao
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments, conducted on four widely-used datasets, demonstrate CausVSR's superiority in enhancing emotion perception within VSR, surpassing existing methods. |
| Researcher Affiliation | Academia | (1) Shanghai Institute of Artificial Intelligence for Education, East China Normal University; (2) MoE Engineering Research Center of SW/HW Co-design Technology and Application, East China Normal University; (3) Shanghai Key Laboratory of Trustworthy Computing, East China Normal University; (4) School of Computing and Information Systems, Singapore Management University |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not include an unambiguous statement about releasing source code or a link to a repository. |
| Open Datasets | Yes | We conducted experiments on four widely used VSR datasets, which include a large-scale dataset, Flickr and Instagram (FI-8) [You et al., 2016], and three small-scale datasets: EmotionROI (6 classes) [Panda et al., 2018], IAPS-Subset (2 classes) [Machajdik and Hanbury, 2010], and Twitter II (2 classes) [Borth et al., 2013]. |
| Dataset Splits | No | The paper mentions using 'widely used VSR datasets' but does not explicitly provide specific train/validation/test split percentages, sample counts, or a detailed splitting methodology for these datasets within the paper. |
| Hardware Specification | Yes | All experiments are conducted on Nvidia Tesla P100-PCIE with a total memory capacity of 16 GB. |
| Software Dependencies | No | The paper mentions the 'PyTorch framework' but does not specify a version number or other software dependencies with version numbers. |
| Experiment Setup | Yes | The training input images are standardized to a size of 448×448. To diversify the training data, we employ random resized cropping initially, followed by random horizontal flips. For model training, we utilize Stochastic Gradient Descent (SGD) as the optimization algorithm, with momentum decay and weight decay set to 0.9 and 5e-4, respectively, for improved computational efficiency. The initial learning rate is set at 1e-4 and is reduced by a factor of 100 every 10 iterations. λ1, λ2, λ3 are the alternatively balanced parameters, which are experientially set on Res2Net-101 to 1, 0.5, and 1. |
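The learning-rate schedule quoted in the Experiment Setup row ("initial learning rate 1e-4, reduced by a factor of 100 every 10 iterations") can be sketched as a simple step-decay function. This is an illustrative reading of the reported setup, not code from the paper; the function name and parameterization are our own:

```python
def step_decay_lr(iteration: int,
                  base_lr: float = 1e-4,
                  drop_factor: float = 100.0,
                  step: int = 10) -> float:
    """Step-decay schedule as described in the paper's setup:
    the learning rate is divided by `drop_factor` once every
    `step` iterations (hypothetical helper, for illustration).
    """
    return base_lr / (drop_factor ** (iteration // step))


# Example: lr starts at 1e-4 and drops to 1e-6 at iteration 10.
for it in (0, 9, 10, 20):
    print(it, step_decay_lr(it))
```

In a PyTorch training loop, the same behavior would typically be obtained with a built-in step scheduler rather than a hand-rolled function; the sketch above only makes the quoted numbers concrete.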