Dual Adversarial Graph Neural Networks for Multi-label Cross-modal Retrieval
Authors: Shengsheng Qian, Dizhan Xue, Huaiwen Zhang, Quan Fang, Changsheng Xu
AAAI 2021, pp. 2440-2448
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive experiments conducted on two cross-modal retrieval benchmark datasets, NUS-WIDE and MIRFlickr, indicate the superiority of DAGNN. |
| Researcher Affiliation | Academia | 1National Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences 2University of Chinese Academy of Sciences 3Peng Cheng Laboratory |
| Pseudocode | No | The paper does not contain a clearly labeled pseudocode block or algorithm. |
| Open Source Code | No | The paper does not provide any explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | NUS-WIDE: we randomly pick up 2,000 image-text pairs as the testing set and the rest as the training set. MIRFlickr: 2,000 image-text pairs are randomly selected as the testing set and the rest are used for training. (See the split sketch after this table.) |
| Dataset Splits | No | The paper specifies training and testing sets, but does not explicitly mention a separate validation set or its size/proportion for hyperparameter tuning. It states "we validate the hyper-parameters α and β" but does not link this to a specific validation split. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU model, CPU type) used for running the experiments. |
| Software Dependencies | No | The paper mentions "implemented on Pytorch deep learning framework" but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | The batch size m is set as 1024 for NUS-WIDE and 100 for MIRFlickr. The initial learning rates of the optimizer are 0.00005 on both datasets. ... we validate the hyper-parameters α and β and finally set α = 0.2, β = 0.2 for both datasets. ... The multi-hop graph neural networks consist of five GAT layers on NUS-WIDE and four GAT layers on MIRFlickr together with one aggregation layer, in which the output dimensionality of each GAT layer and aggregation layer is 1,024. (See the configuration sketch below.) |
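The split protocol quoted in the Open Datasets row (randomly holding out 2,000 image-text pairs per dataset for testing) is mechanically simple to reproduce. The sketch below shows one seeded way to do it; the seed and the corpus size are illustrative assumptions, since the paper reports neither a seed nor any selection procedure beyond "random".

```python
# Hedged sketch of the quoted split protocol: randomly hold out 2,000
# image-text pairs as the test set and train on the rest. The seed and the
# corpus size below are illustrative assumptions, not values from the paper.
import numpy as np

def split_pairs(num_pairs: int, num_test: int = 2000, seed: int = 0):
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_pairs)
    return perm[num_test:], perm[:num_test]  # train indices, test indices

train_idx, test_idx = split_pairs(num_pairs=100_000)  # illustrative count only
```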
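Since the paper releases no code, the Experiment Setup row above is the most concrete basis for reimplementation. Below is a minimal, hypothetical PyTorch sketch of the multi-hop GNN as described for NUS-WIDE (five GAT layers plus one aggregation layer, each with 1,024-dimensional output), built on PyTorch Geometric's `GATConv`. The input dimensionality, activation, aggregation design, and optimizer family are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of the multi-hop graph neural network quoted above
# (NUS-WIDE variant: five GAT layers plus one aggregation layer, each with
# 1,024-dimensional output). NOT the authors' code; the input dimension,
# activation, and aggregation design are assumptions.
import torch
import torch.nn as nn
from torch_geometric.nn import GATConv  # assumes PyTorch Geometric is installed


class MultiHopGNN(nn.Module):
    def __init__(self, in_dim: int = 4096, hidden_dim: int = 1024,
                 num_gat_layers: int = 5):
        super().__init__()
        dims = [in_dim] + [hidden_dim] * num_gat_layers
        # Stack of GAT layers; each outputs 1,024-dim node features.
        self.gat_layers = nn.ModuleList(
            GATConv(dims[i], dims[i + 1]) for i in range(num_gat_layers)
        )
        # Aggregation layer: assumed here to fuse all hop outputs by
        # concatenation followed by a 1,024-dim linear projection.
        self.aggregation = nn.Linear(hidden_dim * num_gat_layers, hidden_dim)

    def forward(self, x: torch.Tensor, edge_index: torch.Tensor) -> torch.Tensor:
        hop_outputs = []
        h = x
        for gat in self.gat_layers:
            h = torch.relu(gat(h, edge_index))
            hop_outputs.append(h)
        # Aggregate the multi-hop representations into one 1,024-dim output.
        return self.aggregation(torch.cat(hop_outputs, dim=-1))


model = MultiHopGNN(num_gat_layers=5)  # 4 on MIRFlickr, per the paper
# The paper reports only the initial learning rate; the optimizer family
# (Adam here) is an assumption.
optimizer = torch.optim.Adam(model.parameters(), lr=5e-5)
alpha, beta = 0.2, 0.2   # reported values; their role belongs to the DAGNN objective
batch_size = 1024        # 100 on MIRFlickr, per the paper
```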