Knowledge Aware Semantic Concept Expansion for Image-Text Matching
Authors: Botian Shi, Lei Ji, Pan Lu, Zhendong Niu, Nan Duan
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments are conducted on Flickr30K and MSCOCO datasets, and prove that our model achieves state-of-the-art results due to the effectiveness of incorporating the external SCG. |
| Researcher Affiliation | Collaboration | (1) Beijing Institute of Technology; (2) Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; (3) Natural Language Computing, Microsoft Research Asia, Beijing, China; (4) University of California, Los Angeles |
| Pseudocode | Yes | Algorithm 1 Concept Expansion |
| Open Source Code | No | The paper provides a link for 'real showcases of retrieval' in the case study section, which are visualizations/results, not the general source code for the methodology. 'please check out this link: https://goo.gl/izcSN9.' |
| Open Datasets | Yes | Visual Genome [Krishna et al., 2017], MSCOCO [Lin et al., 2014], Flickr30K [Young et al., 2014] |
| Dataset Splits | Yes | MSCOCO: We follow [Karpathy and Fei-Fei, 2015] to prepare the training, validation and test dataset by splitting all images to 113,287 (for training), 5,000 (for validation) and 5,000 (for test). Flickr30K: We followed the split in [Karpathy and Fei-Fei, 2015] and [Faghri et al., 2017] that used 1,000 images for testing and 1,000 images for validation and the rest of them (28,783 images) for training. |
| Hardware Specification | No | The paper mentions models like LSTM and VGG19, but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using LSTM, VGG19, ImageNet, and Adam Optimizer, but does not provide specific version numbers for any software libraries or frameworks (e.g., PyTorch, TensorFlow, CUDA). |
| Experiment Setup | Yes | The dimension of concept-enhanced image representation and text representation is e = 512. We used λ1 = 5.0, λ2 = 1.0, λ3 = 1.5 and λ4 = 0.05 as the hyper-parameters of loss function. An Adam Optimizer was adopted to optimize model's parameters. |
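The reported loss hyper-parameters (λ1–λ4) suggest the training objective is a weighted combination of four loss terms. A minimal sketch of that weighting, assuming a simple weighted sum (the term names and structure below are illustrative assumptions, not the paper's actual formulation):

```python
# Hedged sketch: the review table reports four loss weights
# (lambda1..lambda4). The term names below are hypothetical;
# the paper's actual loss components are not reproduced here.
LAMBDAS = {"lambda1": 5.0, "lambda2": 1.0, "lambda3": 1.5, "lambda4": 0.05}

def total_loss(terms):
    """Combine individual loss terms as a weighted sum using LAMBDAS."""
    return sum(LAMBDAS[name] * value for name, value in terms.items())

# Example: with every term equal to 1.0, the total is the sum of the weights.
print(total_loss({name: 1.0 for name in LAMBDAS}))  # → 7.55
```

In practice each term would be computed per batch (e.g. a ranking loss for each retrieval direction) before this weighting is applied by the optimizer step.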