VistaNet: Visual Aspect Attention Network for Multimodal Sentiment Analysis

Authors: Quoc-Tuan Truong, Hady W. Lauw (pp. 305-312)

AAAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on restaurant reviews showcase the effectiveness of visual aspect attention, vis-à-vis visual features or textual attention.
Researcher Affiliation | Academia | Quoc-Tuan Truong, Hady W. Lauw, School of Information Systems, Singapore Management University; qttruong.2017@smu.edu.sg, hadywlauw@smu.edu.sg
Pseudocode | No | The paper includes architectural diagrams and mathematical equations, but no explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | VistaNet is implemented using TensorFlow. https://github.com/PreferredAI/vista-net
Open Datasets | No | We use a dataset of online reviews crawled from the Food and Restaurants categories of Yelp.com, covering 5 different major US cities... The datasets and codes used in this submission will be released publicly upon publication.
Dataset Splits | Yes | We keep the number of examples balanced across classes, and split 80% of the data for training, 5% for validation and 15% for test.
Hardware Specification | No | The paper mentions software like TensorFlow and pre-trained models like VGG-16, but does not specify any CPU, GPU, or memory details used for running the experiments.
Software Dependencies | No | The paper mentions 'NLTK' and 'TensorFlow', but does not provide specific version numbers for these software components.
Experiment Setup | Yes | GRU cells are 50-dimensional for word and sentence encoding (100-dimensional due to bidirectional RNN). Context vectors U, V and K are also 100-dimensional for the attention spaces of word, sentence, and document. In training, we use RMSprop (Tieleman and Hinton 2012) for gradient-based optimization with a mini-batch size of 32.
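
The 80% / 5% / 15% split reported under Dataset Splits could be reproduced along the following lines. This is a minimal sketch and not the authors' code: the function name `balanced_split`, the `label_of` callback, the fixed seed, and the toy data with five classes are all assumptions made for illustration. It splits each class separately so that the class balance carries over to every partition.

```python
import random
from collections import defaultdict


def balanced_split(examples, label_of, train=0.80, val=0.05, seed=0):
    """Split examples per class into train/val/test (hypothetical helper).

    The test share is whatever remains after train and val (15% by default).
    """
    rng = random.Random(seed)  # fixed seed so the split is repeatable

    # Group examples by class label.
    by_class = defaultdict(list)
    for ex in examples:
        by_class[label_of(ex)].append(ex)

    splits = {"train": [], "val": [], "test": []}
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)
        n_train = round(len(items) * train)
        n_val = round(len(items) * val)
        splits["train"] += items[:n_train]
        splits["val"] += items[n_train:n_train + n_val]
        splits["test"] += items[n_train + n_val:]
    return splits


# Toy data (assumed): 500 reviews, 100 per class, labels 1..5.
toy = [(f"review {i}", 1 + i % 5) for i in range(500)]
parts = balanced_split(toy, label_of=lambda ex: ex[1])
print({name: len(rows) for name, rows in parts.items()})
```

Splitting within each class, as opposed to shuffling the whole dataset once, keeps the per-class proportions identical across the three partitions, which matches the balanced setup the paper describes.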