VistaNet: Visual Aspect Attention Network for Multimodal Sentiment Analysis
Authors: Quoc-Tuan Truong, Hady W. Lauw (pp. 305-312)
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on restaurant reviews showcase the effectiveness of visual aspect attention, vis-a-vis visual features or textual attention. |
| Researcher Affiliation | Academia | Quoc-Tuan Truong, Hady W. Lauw; School of Information Systems, Singapore Management University; qttruong.2017@smu.edu.sg, hadywlauw@smu.edu.sg |
| Pseudocode | No | The paper includes architectural diagrams and mathematical equations, but no explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | VistaNet is implemented using TensorFlow. https://github.com/PreferredAI/vista-net |
| Open Datasets | No | We use a dataset of online reviews crawled from the Food and Restaurants categories of Yelp.com, covering 5 different major US cities... The datasets and codes used in this submission will be released publicly upon publication. |
| Dataset Splits | Yes | We keep the number of examples balanced across classes, and split 80% of the data for training, 5% for validation and 15% for test. |
| Hardware Specification | No | The paper mentions software like TensorFlow and pre-trained models like VGG-16, but does not specify any CPU, GPU, or memory details used for running the experiments. |
| Software Dependencies | No | The paper mentions 'NLTK' and 'TensorFlow', but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | GRU cells are 50-dimensional for word and sentence encoding (100-dimensional due to bidirectional RNN). Context vectors U, V and K are also 100-dimensional for the attention spaces of word, sentence, and document. In training, we use RMSprop (Tieleman and Hinton 2012) for gradient based optimization with a mini-batch size of 32. |
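
The Dataset Splits row describes a class-balanced 80% / 5% / 15% split. The sketch below shows one way such a split could be produced. It is not taken from the paper or its repository: the function name `balanced_split`, the downsampling of every class to the size of the smallest class, and the per-class application of the split are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def balanced_split(examples, label_of, percents=(80, 5, 15), seed=0):
    """Hypothetical helper: balance classes, then split train/val/test.

    Assumptions (not confirmed by the paper): classes are balanced by
    downsampling to the smallest class, and the split is applied per class.
    """
    assert sum(percents) == 100, "percentages must sum to 100"
    rng = random.Random(seed)

    # Group examples by class label (e.g. a 1-5 star rating).
    by_class = defaultdict(list)
    for ex in examples:
        by_class[label_of(ex)].append(ex)

    # Every class is cut down to the size of the smallest class.
    n = min(len(items) for items in by_class.values())
    n_train = n * percents[0] // 100
    n_val = n * percents[1] // 100

    train, val, test = [], [], []
    for label in sorted(by_class):
        items = list(by_class[label])
        rng.shuffle(items)
        items = items[:n]
        train.extend(items[:n_train])
        val.extend(items[n_train:n_train + n_val])
        test.extend(items[n_train + n_val:])  # remainder goes to test
    return train, val, test
```

Integer arithmetic is used for the split sizes so that rounding never leaves an example unassigned; any remainder falls into the test portion.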