Web-Based Semantic Fragment Discovery for On-Line Lingual-Visual Similarity

Authors: Xiaoshuai Sun, Jiewei Cao, Chao Li, Lei Zhu, Heng Tao Shen

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on semantic fragment quality assessment, sentence-based image retrieval, and automatic multimedia insertion and ordering demonstrated the effectiveness of the proposed framework.
Researcher Affiliation | Academia | (1) The University of Queensland, Brisbane 4067, Australia. (2) Harbin Institute of Technology, Heilongjiang 150001, China. (3) University of Electronic Science and Technology of China, Chengdu 611731, China.
Pseudocode | No | The paper describes its methods and formulas in prose (e.g., the fragment quality score Q(g) and the estimated density pdf(f)), but does not present them in a formally structured pseudocode or algorithm block (see the first sketch after this table).
Open Source Code | Yes | We make code, datasets and annotations publicly available on our project page.
Open Datasets | Yes | We release two new datasets as extensions of Flickr30K (Young et al. 2014) to enable research and comparisons on Web-based unsupervised sentence understanding tasks. Flickr30K-Phrase: this dataset consists of 3.2 million images with 32,486 weak phrase labels. Flickr30K-Quality: we sample 20K images with 1K phrase labels from Flickr30K-Phrase...
Dataset Splits | No | We adopt the 1K test images in Flickr30K for quantitative evaluation.
Hardware Specification | No | The paper describes the use of VGG-16 and the MatConvNet toolkit but does not provide any specific details about the hardware (e.g., GPU model, CPU type, memory) used to run the experiments.
Software Dependencies | No | Practically, we adopt the MatConvNet toolkit (Vedaldi and Lenc 2015) with a pre-trained VGG-16 model (Simonyan and Zisserman 2014) for image feature extraction (a hedged Python analogue of this step follows the table).
Experiment Setup | Yes | We set the size of each fragment N = 20, and fix the self-similarity threshold to 0.2 in all the tests (one plausible reading of these settings is sketched after the table).
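
The paper gives no algorithm block for its quality measure, so the sketch below is only an illustration of what a density-based fragment score could look like, under the assumption that Q(g) rewards fragments whose image features cluster densely. The function name fragment_quality and the KDE-over-principal-components construction are assumptions of this sketch, not the authors' formulation.

```python
import numpy as np
from scipy.stats import gaussian_kde

def fragment_quality(features: np.ndarray) -> float:
    """Score a candidate semantic fragment (feature vectors of images
    sharing one weak phrase label) by the average estimated density of
    its members. Illustrative only; the paper's Q(g) and pdf(f) are not
    specified in pseudocode and may differ."""
    # Project onto a few principal directions so the KDE stays well
    # conditioned even for high-dimensional CNN features.
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:3].T            # keep 3 components
    kde = gaussian_kde(projected.T)            # pdf(f) estimate
    return float(kde(projected.T).mean())      # mean density as Q(g)
```

For example, feats = np.random.rand(50, 4096) can stand in for 50 CNN features of one weakly labeled phrase, and fragment_quality(feats) returns a single scalar score.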
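
The paper's feature extraction runs in MATLAB via MatConvNet with a pre-trained VGG-16. Since that code is not quoted here, the following is a hedged Python analogue using torchvision's VGG-16; the extract_feature helper and the choice of the 4096-d penultimate (fc7) activation are assumptions of this sketch, not the paper's exact pipeline.

```python
import torch
from torchvision import models, transforms
from PIL import Image

# Substitute for the MatConvNet + VGG-16 setup: torchvision's pre-trained
# VGG-16, truncated before the final classifier layer so the 4096-d
# activation can serve as the image feature.
vgg = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1).eval()
feature_head = torch.nn.Sequential(*list(vgg.classifier.children())[:-1])

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

def extract_feature(path: str) -> torch.Tensor:
    """Return a 4096-d feature for one image (hypothetical helper)."""
    img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        pooled = vgg.avgpool(vgg.features(img)).flatten(1)
        return feature_head(pooled).squeeze(0)
```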
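
The reported setup fixes the fragment size N = 20 and a self-similarity threshold of 0.2, but not how the threshold is applied. One plausible reading (an assumption, not the authors' procedure) is to keep only members whose cosine similarity to the fragment centroid clears the threshold and to cap the fragment at N images:

```python
import numpy as np

FRAGMENT_SIZE = 20        # N = 20 images per fragment (from the paper)
SELF_SIM_THRESHOLD = 0.2  # self-similarity threshold (from the paper)

def build_fragment(features: np.ndarray) -> np.ndarray:
    """One plausible reading of the reported setup: retain members whose
    cosine similarity to the fragment centroid is at least the threshold,
    then keep at most N of the most similar ones."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    centroid = unit.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    sims = unit @ centroid
    order = np.argsort(-sims)                  # most similar first
    order = order[sims[order] >= SELF_SIM_THRESHOLD]
    return features[order[:FRAGMENT_SIZE]]
```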