Dual Adversarial Networks for Zero-shot Cross-media Retrieval
Authors: Jingze Chi, Yuxin Peng
IJCAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on three widely-used cross-media retrieval datasets show the effectiveness of our approach. |
| Researcher Affiliation | Academia | Institute of Computer Science and Technology, Peking University, Beijing, China |
| Pseudocode | No | The paper describes the training procedure and model architecture in text and equations, but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states 'We adopt TensorFlow to implement our model' (footnoting www.tensorflow.org), but does not explicitly state that the source code for their DANZCR method is open-source or provide a link to it. |
| Open Datasets | Yes | Wikipedia dataset [Rasiwasia et al., 2010] is widely used for cross-media retrieval evaluation. Pascal Sentence dataset [Farhadi et al., 2010] is selected from the 2008 PASCAL development kit. NUS-WIDE dataset [Chua et al., 2009] consists of about 270,000 images with their tags categorized into 81 categories. |
| Dataset Splits | No | The paper explicitly describes training and testing sets for all datasets (e.g., '2,173 pairs are selected as training set and 693 pairs are selected as testing set') and a further division into a 'seen category set' and an 'unseen category set' (see the split sketch after the table), but does not mention a 'validation set' or 'development set'. |
| Hardware Specification | No | The paper mentions using TensorFlow for implementation but does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper states 'We adopt TensorFlow to implement our model' but does not provide version numbers for TensorFlow or for other key software components such as Word2Vec, VGGNet, or Doc2Vec. |
| Experiment Setup | Yes | We adopt TensorFlow to implement our model with a base learning rate of 1e-4 and dropout probability 0.9. The parameters λF and λR are set to 1e-2. ... The forward generative models with three fully-connected layers are adopted for both image and text to generate common representations, with each layer followed by a ReLU layer and a dropout layer except the last. The numbers of hidden units are 4,096, 4,096 and 300. The reverse generative models of both image and text are composed of three fully-connected layers to reconstruct the image and text representations, with 4,096 hidden units for the first two layers. The forward and reverse discriminative models have a similar structure of three fully-connected layers with 4,096, 2,048 and 1 hidden units. (A minimal architecture sketch appears after the table.) |
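
The zero-shot protocol quoted in the Dataset Splits row partitions each dataset's categories into a seen set and an unseen set before routing image-text pairs. Below is a minimal sketch of that partition; the `pairs` tuple layout and the random choice of unseen categories are illustrative assumptions, since the paper does not specify how the partition was drawn.

```python
# Minimal sketch of the seen/unseen zero-shot split. The tuple layout
# (image_feature, text_feature, category) and the random choice of
# unseen categories are illustrative assumptions, not paper details.
import random

def zero_shot_split(pairs, categories, n_unseen, seed=0):
    """Partition categories into seen/unseen sets, then route each
    image-text pair to the matching subset by its category label."""
    rng = random.Random(seed)
    unseen = set(rng.sample(sorted(categories), n_unseen))
    seen_pairs = [p for p in pairs if p[2] not in unseen]
    unseen_pairs = [p for p in pairs if p[2] in unseen]
    return seen_pairs, unseen_pairs
```

The sketch only shows the mechanics of the split; the specific counts quoted above (e.g., 2,173 training pairs and 693 testing pairs for Wikipedia) depend on the particular partition the authors fixed.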
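
The Experiment Setup row fully specifies the layer widths, so a structural sketch can be reconstructed. The following TensorFlow/Keras code mirrors only the quoted sizes (4,096/4,096/300 for the forward generators; 4,096/4,096 then the input width for the reverse generators; 4,096/2,048/1 for the discriminators). The Keras API choice, the input dimensions, the activations in the reverse and discriminative models, and the reading of 'dropout probability 0.9' as a TF1-style keep probability are all assumptions, not the authors' released implementation.

```python
# A minimal TensorFlow/Keras sketch of the quoted layer sizes.
# Everything beyond the widths themselves (API, input dims, dropout
# interpretation, hidden activations) is an assumption.
import tensorflow as tf
from tensorflow.keras import layers

# The paper says "dropout probability 0.9"; TF1's tf.nn.dropout took a
# keep probability, so this assumes a drop rate of 1 - 0.9 = 0.1.
DROP_RATE = 0.1

def forward_generator(input_dim):
    """Maps a modality-specific feature into the 300-d common space;
    each layer is followed by ReLU and dropout except the last."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(DROP_RATE),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(DROP_RATE),
        layers.Dense(300),  # last layer: no ReLU, no dropout
    ])

def reverse_generator(output_dim):
    """Reconstructs the original representation from the common space;
    the first two layers have 4,096 hidden units."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(300,)),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(DROP_RATE),
        layers.Dense(4096, activation="relu"),
        layers.Dropout(DROP_RATE),
        layers.Dense(output_dim),
    ])

def discriminator(input_dim):
    """Three fully-connected layers with 4,096, 2,048 and 1 units."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        layers.Dense(4096, activation="relu"),
        layers.Dense(2048, activation="relu"),
        layers.Dense(1),
    ])
```

For instance, `forward_generator(4096)` would map 4,096-d image features (assuming VGGNet fc7-style features) into the 300-d common space; the exact per-modality input dimensions are not restated in the quoted setup.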