HAL: Improved Text-Image Matching by Mitigating Visual Semantic Hubs
Authors: Fangyu Liu, Rongtian Ye, Xun Wang, Shuaipeng Li
Pages: 11563-11571
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment our method with various configurations of model architectures and datasets. The method exhibits exceptionally good robustness and brings consistent improvement on the task of text-image matching across all settings. |
| Researcher Affiliation | Collaboration | Fangyu Liu (University of Cambridge, Cambridge, UK); Rongtian Ye (Aalto University, Espoo, Finland); Xun Wang (Malong Technologies, Shenzhen, China); Shuaipeng Li (SenseTime Research, Beijing, China) |
| Pseudocode | No | The paper describes the loss functions mathematically and conceptually but does not provide a pseudocode block or algorithm. |
| Open Source Code | Yes | Our code is released at: https://github.com/hardyqr/HAL. |
| Open Datasets | Yes | We use MS-COCO (Lin et al. 2014) and Flickr30k (Young et al. 2014) as our experimental datasets. |
| Dataset Splits | Yes | For MSCOCO... 113,287 images for training, 5,000 for validation and 5,000 for testing. Flickr30k has 30,000 images for training; 1,000 for validation; 1,000 for testing. |
| Hardware Specification | Yes | We do not include HAL+MB for (Vendrov et al. 2016) as it demands GPU memory exceeding 11GB, which is the limit of our used GTX 2080Ti. |
| Software Dependencies | No | The paper mentions software components like GRU, ResNet152, Inception-ResNet-v2, and VGG19, but does not specify their version numbers or the versions of any underlying programming languages or libraries (e.g., PyTorch, TensorFlow, Python). |
| Experiment Setup | Yes | For more details about hyperparameters and training configurations please refer to Table 3 and code release: https://github.com/hardyqr/HAL. Table 3 lists specific hyperparameters such as 'margin=0.2, lr=0.001, lr update=10, bs=128, epoch=30', 'γ=30, ϵ=0.3', 'α=40, β=40, ϵ1=0.2, ϵ2=0.1'. |
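
For anyone attempting a reproduction, the hyperparameter strings quoted from Table 3 can be turned into plain dictionaries before being passed to a training script. The following is only a minimal sketch: the helper name `parse_hparams` and the list `TABLE3_GROUPS` are hypothetical and do not come from the HAL code release, and the summary above does not say which loss or model each group of values belongs to, so the groups are kept separate and unlabeled.

```python
def parse_hparams(spec: str) -> dict:
    """Parse a 'key=value, key=value' string into a dict of numbers.

    Hypothetical helper for illustration; not part of the HAL repository.
    Values without a decimal point become int, all others become float.
    """
    parsed = {}
    for item in spec.split(","):
        key, value = item.split("=")
        key, value = key.strip(), value.strip()
        parsed[key] = float(value) if "." in value else int(value)
    return parsed


# The three groups exactly as quoted from Table 3 in the row above.
TABLE3_GROUPS = [
    "margin=0.2, lr=0.001, lr update=10, bs=128, epoch=30",
    "γ=30, ϵ=0.3",
    "α=40, β=40, ϵ1=0.2, ϵ2=0.1",
]

hparam_groups = [parse_hparams(group) for group in TABLE3_GROUPS]

for group in hparam_groups:
    print(group)
```

Keys keep the spelling used in the quoted strings (for example `lr update` and `bs`); mapping them onto the actual command-line flags of the released code would require checking the repository.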