HAL: Improved Text-Image Matching by Mitigating Visual Semantic Hubs

Authors: Fangyu Liu, Rongtian Ye, Xun Wang, Shuaipeng Li

AAAI 2020, pp. 11563-11571 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We experiment our method with various configurations of model architectures and datasets. The method exhibits exceptionally good robustness and brings consistent improvement on the task of text-image matching across all settings.
Researcher Affiliation | Collaboration | Fangyu Liu (University of Cambridge, Cambridge, UK), Rongtian Ye (Aalto University, Espoo, Finland), Xun Wang (Malong Technologies, Shenzhen, China), Shuaipeng Li (SenseTime Research, Beijing, China)
Pseudocode | No | The paper describes its loss functions mathematically and conceptually but does not provide a pseudocode block or algorithm listing (a hedged sketch of a loss of this kind follows this table).
Open Source Code | Yes | Our code is released at: https://github.com/hardyqr/HAL.
Open Datasets | Yes | We use MS-COCO (Lin et al. 2014) and Flickr30k (Young et al. 2014) as our experimental datasets.
Dataset Splits | Yes | For MS-COCO... 113,287 images for training, 5,000 for validation and 5,000 for testing. Flickr30k has 30,000 images for training; 1,000 for validation; 1,000 for testing.
Hardware Specification | Yes | We do not include HAL+MB for (Vendrov et al. 2016) as it demands GPU memory exceeding 11GB, which is the limit of our used GTX 2080Ti.
Software Dependencies | No | The paper mentions software components such as GRU, ResNet152, Inception-ResNet-v2, and VGG19, but does not give version numbers for them or for the underlying languages and libraries (e.g., PyTorch, TensorFlow, Python).
Experiment Setup | Yes | For more details about hyperparameters and training configurations please refer to Table 3 and code release: https://github.com/hardyqr/HAL. Table 3 lists specific hyperparameters such as 'margin=0.2, lr=0.001, lr update=10, bs=128, epoch=30', 'γ=30, ϵ=0.3', and 'α=40, β=40, ϵ1=0.2, ϵ2=0.1' (a hedged configuration sketch follows this table).
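
Because the paper states its loss only in prose and equations, the following PyTorch sketch illustrates the kind of weighted, log-sum-exp ranking objective it describes: every negative contributes in proportion to how close it is to the matched pair, so "hub" candidates that sit near many queries dominate the gradient. This is an illustration under assumptions, not the authors' exact HAL formulation; the function name, the cosine-similarity input convention, and the precise weighting form are choices made here, while the default γ=30, ϵ=0.3 are the values quoted from Table 3.

    import torch
    import torch.nn.functional as F

    def weighted_matching_loss(sim, gamma=30.0, eps=0.3):
        """sim: (N, N) similarity matrix between N images (rows) and their
        N matched captions (columns); sim[i, i] is the positive pair.

        Smooth log-sum-exp ranking loss: each negative j adds
        exp(gamma * (sim[i, j] - sim[i, i] + eps)), so negatives that are
        nearly as similar as the positive receive exponentially more weight.
        """
        n = sim.size(0)
        pos = sim.diag()                                   # matched-pair similarities
        diag = torch.eye(n, dtype=torch.bool, device=sim.device)

        # image -> text: compare each row's negative captions against its positive
        i2t = gamma * (sim - pos.unsqueeze(1) + eps)
        i2t = i2t.masked_fill(diag, float('-inf'))         # exclude the positive itself
        # text -> image: compare each column's negative images against its positive
        t2i = gamma * (sim - pos.unsqueeze(0) + eps)
        t2i = t2i.masked_fill(diag, float('-inf'))

        # log(1 + sum_j exp(.)) == softplus(logsumexp(.)); divide by gamma to undo scaling
        loss = F.softplus(torch.logsumexp(i2t, dim=1)).mean()
        loss = loss + F.softplus(torch.logsumexp(t2i, dim=0)).mean()
        return loss / gamma

    # Usage with random, L2-normalized embeddings (cosine similarities):
    images = F.normalize(torch.randn(128, 1024), dim=1)
    captions = F.normalize(torch.randn(128, 1024), dim=1)
    print(weighted_matching_loss(images @ captions.t()).item())

The log-sum-exp form is a smooth stand-in for hardest-negative mining: as γ grows it approaches a max over negatives, while smaller γ spreads the penalty over many moderately hard negatives.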
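As a companion to the Experiment Setup row, here is a minimal sketch of how the quoted Table 3 settings might be collected into one training configuration. The key names (lr_update, batch_size) and the split of the γ/ϵ and α/β/ϵ1/ϵ2 groups into a base-loss and a memory-bank (HAL+MB) variant are assumptions for illustration; only the numeric values come from the quoted row.

    # Illustrative only: grouping and key names are assumptions;
    # the numbers are those quoted from Table 3.
    train_config = {
        "margin": 0.2,        # triplet margin for the baseline ranking loss
        "lr": 1e-3,           # initial learning rate
        "lr_update": 10,      # epoch at which the learning rate is decayed
        "batch_size": 128,
        "epochs": 30,
        "hal": {"gamma": 30, "eps": 0.3},                              # quoted as γ=30, ϵ=0.3
        "hal_mb": {"alpha": 40, "beta": 40, "eps1": 0.2, "eps2": 0.1},  # quoted α, β, ϵ1, ϵ2
    }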