Action-Aware Embedding Enhancement for Image-Text Retrieval

Authors: Jiangtong Li, Li Niu, Liqing Zhang (pp. 1323–1331)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of our proposed AME method is verified by comprehensive experimental results on two benchmark datasets.
Researcher Affiliation | Academia | Jiangtong Li, Li Niu, Liqing Zhang; MoE Key Lab of Artificial Intelligence, Department of Computer Science and Engineering, Shanghai Jiao Tong University; {keep_moving-lee, ustcnewly}@sjtu.edu.cn, zhang-lq@cs.sjtu.edu.cn
Pseudocode | No | The paper describes the methodology in text and with a flowchart (Figure 2), but does not contain a formal pseudocode or algorithm block.
Open Source Code | No | No explicit statement about releasing source code, and no direct link to a code repository for the described method, was found.
Open Datasets | Yes | We evaluate our AME method and all the other baselines on two large-scale benchmark datasets: Flickr30K (Young et al. 2014) and Microsoft COCO (Lin et al. 2014).
Dataset Splits | Yes | We follow the split in (Lee et al. 2018), using 1,000 images for validation, 1,000 images for testing, and 29,000 images for training. For Microsoft COCO (Lin et al. 2014), following the split in (Lee et al. 2018), we select 5,000 validation images and 5,000 test images from the original validation set and add the remaining 30,504 images to the training set.
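The Flickr30K split reported above can be sketched as follows. This is a minimal illustration only: the function name and the seeded shuffle are assumptions, since the canonical split from Lee et al. (2018) assigns specific image ids via published split files rather than random partitioning.

```python
import random

def make_flickr30k_split(image_ids, seed=0):
    """Partition 31,000 Flickr30K image ids into the reported sizes:
    29,000 train / 1,000 validation / 1,000 test.

    Illustrative only: a real reproduction should load the split files
    from Lee et al. (2018) instead of shuffling randomly.
    """
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # deterministic shuffle for repeatability
    return {
        "train": ids[:29000],
        "val": ids[29000:30000],
        "test": ids[30000:31000],
    }

split = make_flickr30k_split(range(31000))
```

The three subsets are disjoint by construction, which is the property the benchmark protocol relies on.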
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned in the paper.
Software Dependencies | No | The paper mentions several software components, such as the Stanford CoreNLP model, pre-trained BERT, a Faster R-CNN model, Bi-GRU, and GloVe, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | No | The paper states "More details about implementation can be found in Supplementary." and does not provide specific hyperparameters such as learning rate, batch size, or optimizer settings in the main text.