Image-to-Image Retrieval by Learning Similarity between Scene Graphs
Authors: Sangwoong Yoon, Woo Young Kang, Sungwook Jeon, SeongEun Lee, Changjin Han, Jonghun Park, Eun-Sol Kim
AAAI 2021, pp. 10718-10726
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose a novel approach for image-to-image retrieval using scene graph similarity measured by graph neural networks. In our approach, graph neural networks are trained to predict the proxy image relevance measure, computed from human-annotated captions using a pre-trained sentence similarity model. We collect and publish the dataset for image relevance measured by human annotators to evaluate retrieval algorithms. The collected dataset shows that our method agrees better with the human perception of image similarity than other competitive baselines. (See the proxy-relevance sketch after the table.) |
| Researcher Affiliation | Collaboration | Seoul National University Robotics Lab; Kakao Brain; Seoul National University Information Management Lab |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our dataset is available online1. [Footnote 1]: https://github.com/swyoon/aaai2021-scene-graph-img-retr |
| Open Datasets | Yes | The first dataset is the intersection of the Visual Genome (Krishna et al. 2017) and MS-COCO (Lin et al. 2014), which we will refer to as VG-COCO. ... The second dataset is Flickr30K (Plummer et al. 2017), where five captions are provided per image. |
| Dataset Splits | Yes | Flickr30K contains 30,000 training images, 1,000 validation images, and 1,000 testing images. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, processor types, memory amounts) used for running experiments are provided in the paper. |
| Software Dependencies | No | The paper mentions software like 'Sentence-BERT' and 'Adam optimizer' but does not provide specific version numbers for these or other key software components (e.g., Python, PyTorch, TensorFlow, CUDA) needed for replication. |
| Experiment Setup | Yes | We use the Adam optimizer with an initial learning rate of 0.0001. We multiply the learning rate by 0.9 every epoch. We set the batch size to 32, and models are trained for 25 epochs. (See the training-setup sketch after the table.) |
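
The proxy relevance measure quoted in the Research Type row is computed from human-annotated captions with a pre-trained sentence similarity model. Below is a minimal sketch of how such a measure could be computed with the sentence-transformers package; the checkpoint name, the use of cosine similarity, and the mean over caption pairs are illustrative assumptions, not the paper's confirmed configuration.

```python
from sentence_transformers import SentenceTransformer, util

# Assumed checkpoint; the paper cites Sentence-BERT without a version.
model = SentenceTransformer("all-MiniLM-L6-v2")

def proxy_relevance(captions_a, captions_b):
    """Proxy relevance of two images: mean cosine similarity over all
    caption pairs (the aggregation choice here is an assumption)."""
    emb_a = model.encode(captions_a, convert_to_tensor=True)
    emb_b = model.encode(captions_b, convert_to_tensor=True)
    return util.cos_sim(emb_a, emb_b).mean().item()

# Example: two images, each described by human-annotated captions.
score = proxy_relevance(
    ["a dog runs on the beach", "a puppy plays in the sand"],
    ["a dog playing near the ocean"],
)
print(f"proxy relevance: {score:.3f}")
```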
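
The Experiment Setup row maps directly onto a standard PyTorch optimizer and scheduler configuration. The sketch below follows the quoted values (initial LR 0.0001, x0.9 decay per epoch, batch size 32, 25 epochs); the linear model, MSE loss, and synthetic data are stand-ins for the paper's graph neural network, which is not reproduced here.

```python
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import ExponentialLR
from torch.utils.data import DataLoader, TensorDataset

# Stand-ins so the sketch runs end to end; not the paper's architecture.
model = nn.Linear(8, 1)
criterion = nn.MSELoss()
train_dataset = TensorDataset(torch.randn(256, 8), torch.randn(256, 1))

optimizer = Adam(model.parameters(), lr=1e-4)    # initial LR 0.0001
scheduler = ExponentialLR(optimizer, gamma=0.9)  # multiply LR by 0.9
loader = DataLoader(train_dataset, batch_size=32, shuffle=True)

for epoch in range(25):                          # 25 training epochs
    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
    scheduler.step()                             # decay once per epoch
```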