DUET: Cross-Modal Semantic Grounding for Contrastive Zero-Shot Learning
Authors: Zhuo Chen, Yufeng Huang, Jiaoyan Chen, Yuxia Geng, Wen Zhang, Yin Fang, Jeff Z. Pan, Huajun Chen
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We find that DUET can achieve state-of-the-art performance on three standard ZSL benchmarks and a knowledge graph equipped ZSL benchmark, and that its components are effective and its predictions are interpretable. |
| Researcher Affiliation | Collaboration | Zhuo Chen1, 2, 6, Yufeng Huang3, 6, Jiaoyan Chen4, Yuxia Geng1, 6, Wen Zhang3, 6, Yin Fang1, 6, Jeff Z. Pan5, Huajun Chen1, 2, 6* 1College of Computer Science and Technology, Zhejiang University 2Donghai Laboratory, Zhoushan 316021, China 3School of Software Technology, Zhejiang University 4Department of Computer Science, The University of Manchester 5School of Informatics, The University of Edinburgh 6Alibaba-Zhejiang University Joint Institute of Frontier Technologies |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block found in the paper. |
| Open Source Code | Yes | Our code is available at https://github.com/zjukg/DUET. |
| Open Datasets | Yes | We select three standard attribute equipped ZSL benchmarks AWA2 (Xian et al. 2019), CUB (Welinder et al. 2010), SUN (Patterson and Hays 2012) with their splits proposed in (Xian et al. 2019), as well as a knowledge graph (KG) equipped benchmark AWA2-KG which has the same split as AWA2 but includes semantic information about hierarchical classes and attributes, for evaluation. |
| Dataset Splits | Yes | "We select three standard attribute equipped ZSL benchmarks AWA2 (Xian et al. 2019), CUB (Welinder et al. 2010), SUN (Patterson and Hays 2012) with their splits proposed in (Xian et al. 2019)"; the paper also states that "γ is the calibration factor tuned on a held-out validation set" (see the calibration sketch after the table). |
| Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU types, or memory specifications) were found. The paper only mentions 'ViT-base as the vision encoder', which is a model architecture, not hardware. |
| Software Dependencies | No | The paper mentions software components like 'pre-trained language models (PLMs)', 'vision transformer', and 'ResNet', but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | For those coefficients in AWA2, we set λ_ar to 0.01, λ_con to 0.05, λ_cmr to 1, λ_acl to 0.01, r_rap to 0.5, ρ to 0.4 and γ to 0.8. (See the loss-weight sketch after the table.) |
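
The AWA2 coefficients quoted in the Experiment Setup row are loss weights. The sketch below shows one plausible way they could enter a weighted training objective; the term names (attribute regression, contrastive, cross-modal reconstruction, attribute-level contrastive) are inferred from the coefficient subscripts and may not match the variable names in the released code at https://github.com/zjukg/DUET.

```python
# Hedged sketch: combining loss terms with the AWA2 coefficients quoted above.
# The meaning of each term is an assumption inferred from its subscript;
# consult the paper and repository for the actual definitions.
import torch

# Coefficients reported for AWA2 in the Experiment Setup row.
LAMBDA_AR = 0.01    # λ_ar: attribute regression weight (assumed meaning)
LAMBDA_CON = 0.05   # λ_con: contrastive loss weight (assumed meaning)
LAMBDA_CMR = 1.0    # λ_cmr: cross-modal reconstruction weight (assumed meaning)
LAMBDA_ACL = 0.01   # λ_acl: attribute-level contrastive weight (assumed meaning)

def total_loss(loss_ar: torch.Tensor,
               loss_con: torch.Tensor,
               loss_cmr: torch.Tensor,
               loss_acl: torch.Tensor) -> torch.Tensor:
    """Weighted sum of the individual loss terms (illustrative only)."""
    return (LAMBDA_AR * loss_ar
            + LAMBDA_CON * loss_con
            + LAMBDA_CMR * loss_cmr
            + LAMBDA_ACL * loss_acl)
```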
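
The Dataset Splits row quotes a calibration factor γ tuned on a held-out validation set. A common use of such a factor in generalized zero-shot learning is calibrated stacking, where seen-class scores are penalized by γ before prediction; the sketch below illustrates that idea under that assumption and does not claim to reproduce DUET's exact inference code.

```python
# Hedged sketch of calibrated stacking: seen-class scores are reduced by the
# calibration factor gamma before taking the argmax. Whether DUET applies γ
# exactly this way is not stated here; this is illustrative only.
import numpy as np

def calibrated_predict(scores: np.ndarray,
                       seen_class_mask: np.ndarray,
                       gamma: float = 0.8) -> np.ndarray:
    """Return predicted class indices after penalizing seen-class scores.

    scores: (num_samples, num_classes) compatibility scores.
    seen_class_mask: boolean vector of length num_classes, True for seen classes.
    gamma: calibration factor (0.8 is the AWA2 value quoted above).
    """
    calibrated = scores - gamma * seen_class_mask.astype(scores.dtype)
    return calibrated.argmax(axis=1)

# Example: 2 samples, 3 classes (classes 0 and 1 are seen, class 2 is unseen).
scores = np.array([[0.9, 0.2, 0.5],
                   [0.1, 0.7, 0.65]])
seen = np.array([True, True, False])
print(calibrated_predict(scores, seen))  # -> [2 2] after calibration
```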