Dynamically Visual Disambiguation of Keyword-based Image Search

Authors: Yazhou Yao, Zeren Sun, Fumin Shen, Li Liu, Limin Wang, Fan Zhu, Lizhong Ding, Gangshan Wu, Ling Shao

IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate the superiority of our proposed approach.
Researcher Affiliation | Collaboration | 1. Nanjing University of Science and Technology, Nanjing, China; 2. Inception Institute of Artificial Intelligence, Abu Dhabi, UAE; 3. University of Electronic Science and Technology of China, Chengdu, China; 4. Nanjing University, Nanjing, China
Pseudocode | No | The paper describes its methods through text and diagrams (Fig. 2 and Fig. 3) but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | The paper does not contain any explicit statement about releasing open-source code for the described methodology, nor does it provide any links to a code repository.
Open Datasets | Yes | Two widely used polysemy datasets, CMU-Polysemy-30 [Chen, 2015] and MIT-ISD [Saenko, 2009], are employed to validate the proposed framework.
Dataset Splits | No | For the main model training and evaluation, the paper states, 'we exploit web images as the training set, human-labeled images in CMU-Polysemy-30 and MIT-ISD as the testing set.' While a split is mentioned for an intermediate step ('The collected 100 images for each selected text query were randomly split into a training set and testing set (e.g., I_m = {I_m^t = 50, I_m^v = 50} and I_n = {I_n^t = 50, I_n^v = 50})'), a distinct validation set for the primary model training is not explicitly described.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU/CPU models, processor types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using a 'linear SVM classifier' and deep learning models like 'VGG-16' and 'Alex Net', but it does not specify any software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | DMIL is trained for 100 epochs with an initial learning rate selected from [0.0001, 0.002]. In addition, parameters for text query selection are specified: 'α is selected from {0.2, 0.4, 0.5, 0.6, 0.8} and β is selected from {10, 20, 30, 40, 50} in (2).', 'γn is set to 0.5 in (4).', 'The value of I(tq) is selected from {1, 2, 3, 4, 5, 6, 7, 8, 9}.', and 'N is selected from {10, 20, 30, 40, 50, 60}.'
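
For readers who want to reproduce the reported setup, the quoted search space can be collected into a small configuration grid. The sketch below is a minimal illustration, not the authors' code: the dictionary keys, the iter_configs helper, and the decision to list only the two endpoints of the learning-rate range as grid values are assumptions made for this sketch; the numeric values themselves are taken from the quoted experiment-setup description.

```python
# Hedged sketch of the hyperparameter search space reported in the paper.
# Only the numeric values come from the paper; the key names and the
# grid-search structure are assumptions for illustration.
from itertools import product

search_space = {
    "epochs": [100],                         # DMIL is trained for 100 epochs
    "initial_lr": [0.0001, 0.002],           # endpoints of the reported range [0.0001, 0.002] (simplification)
    "alpha": [0.2, 0.4, 0.5, 0.6, 0.8],      # α in Eq. (2)
    "beta": [10, 20, 30, 40, 50],            # β in Eq. (2)
    "gamma_n": [0.5],                        # γn in Eq. (4), fixed at 0.5
    "I_tq": [1, 2, 3, 4, 5, 6, 7, 8, 9],     # value of I(tq)
    "N": [10, 20, 30, 40, 50, 60],           # N, number of candidates
}

def iter_configs(space):
    """Yield one configuration dict per combination of the search space."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

if __name__ == "__main__":
    configs = list(iter_configs(search_space))
    print(f"{len(configs)} candidate configurations")
    print(configs[0])
```

Since the paper gives the initial learning rate as a range rather than a discrete set, an actual reproduction would likely sample or sweep values inside [0.0001, 0.002] rather than only its endpoints.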