Cross-media Multi-level Alignment with Relation Attention Network

Authors: Jinwei Qi, Yuxin Peng, Yuxin Yuan

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct experiments on 2 cross-media datasets, and compare with 10 state-of-the-art methods to verify the effectiveness of proposed approach."
Researcher Affiliation | Academia | "Jinwei Qi, Yuxin Peng and Yuxin Yuan, Institute of Computer Science and Technology, Peking University, Beijing 100871, China. pengyuxin@pku.edu.cn"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for its methodology is publicly available.
Open Datasets | Yes | "Flickr-30K dataset [Young et al., 2014]... MS-COCO dataset [Lin et al., 2014]"
Dataset Splits | Yes | "Following [Peng et al., 2017; Tran et al., 2016], there are 1,000 pairs in testing set and 1,000 pairs for validation, while the rest are for training."
Hardware Specification | No | The paper does not specify any hardware details (e.g., CPU or GPU models) used to run the experiments.
Software Dependencies | No | "Our proposed CRAN approach is implemented by Torch." (No version number for Torch or other software dependencies is provided.)
Experiment Setup | Yes | "The length of sequence is set as 201. There are three convolutional layers in Char-CNN, and the parameter combinations are (384, 4), (512, 4) and (2048, 4). The outputs of Char-CNN are processed by an LSTM network. Their output dimension is 2048. ...all the margins α in loss functions are set to 1. We set K = 3 for local and relation alignment in cross-media similarity measurement. The learning rate of our proposed approach is decreased by a half each 50 epochs, while it is initialized as 0.0004."
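The stated training hyperparameters (initial learning rate 0.0004 halved every 50 epochs, margin α = 1) can be sketched in plain Python. This is a minimal illustration of the schedule and margin values only; the function names and the specific ranking-loss form are assumptions, since the paper does not give its loss implementation:

```python
def learning_rate(epoch, base_lr=0.0004, decay_every=50):
    """Halve the initial learning rate of 0.0004 every 50 epochs,
    as stated in the Experiment Setup row."""
    return base_lr * 0.5 ** (epoch // decay_every)


def margin_ranking_loss(d_pos, d_neg, alpha=1.0):
    """Generic margin-based ranking loss with alpha = 1. The paper sets
    all margins in its loss functions to 1, but this exact hinge form
    is an illustrative assumption, not the paper's definition."""
    return max(0.0, alpha + d_pos - d_neg)


print(learning_rate(0))    # 0.0004
print(learning_rate(50))   # 0.0002
print(learning_rate(120))  # 0.0001
```

Under this reading of the schedule, epochs 0–49 use 0.0004, epochs 50–99 use 0.0002, and so on; the ranking loss is zero once a matched pair is closer than a mismatched pair by at least the margin.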