ProCC: Progressive Cross-Primitive Compatibility for Open-World Compositional Zero-Shot Learning

Authors: Fushuo Huo, Wenchao Xu, Song Guo, Jingcai Guo, Haozhao Wang, Ziming Liu, Xiaocheng Lu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on three widely used benchmark datasets demonstrate that our method outperforms other representative methods on both OW-CZSL and pCZSL settings by large margins.
Researcher Affiliation | Academia | (1) Department of Computing, The Hong Kong Polytechnic University, Hong Kong SAR; (2) The Hong Kong Polytechnic University Shenzhen Research Institute, Shenzhen, China; (3) Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR; (4) School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan, China
Pseudocode | Yes | For the detailed training procedure, please refer to Appendix: Algorithm 1.
Open Source Code | Yes | Code is available at https://github.com/huofushuo/procc and the appendix is at https://arxiv.org/abs/2211.12417
Open Datasets | Yes | We conduct experiments on three widely used datasets: UT-Zappos (Yu and Grauman 2014), MIT-States (Isola, Lim, and Adelson 2015), and C-GQA (Misra, Gupta, and Hebert 2017). Details of the three datasets are listed in Appendix 1.
Dataset Splits | Yes | For OW-CZSL, we follow the splits of (Mancini et al. 2021, 2022; Karthik, Mancini, and Akata 2022) and evaluate under the generalized setting, where the test samples come from both seen and unseen compositions.
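For context, the open-world protocol scores each test image against the full Cartesian product of attributes and objects rather than a curated subset of compositions. A minimal sketch of building that candidate space, with illustrative primitive names that are not taken from the paper or its released code:

    from itertools import product

    def build_open_world_pairs(attributes, objects):
        """All attribute-object compositions form the OW-CZSL candidate label space."""
        return list(product(attributes, objects))

    # Illustrative example with a handful of MIT-States-style primitives.
    attrs = ["ripe", "sliced", "rusty"]
    objs = ["apple", "bicycle"]
    pairs = build_open_world_pairs(attrs, objs)
    # -> 6 candidate compositions, including unseen ones such as ("rusty", "apple")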
Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU model, CPU type).
Software Dependencies | No | The paper mentions PyTorch but does not provide specific version numbers for the software dependencies or libraries required for reproduction.
Experiment Setup | Yes | We use PyTorch to implement our network and optimize it with Adam (Kingma and Ba 2015) with default settings. The batch size is 256, and the learning rate is 5.0 × 10^-5 for the first two stages and 1.0 × 10^-5 for the third stage. For the UT-Zappos, MIT-States, and C-GQA datasets, the total training time is approximately 1, 3, and 5 hours for 30/60/20, 40/80/30, and 50/100/25 epochs across the three stages, respectively, with an early stopping strategy.
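As a rough sketch of the reported schedule (not the authors' released code), the three-stage optimization could be wired up as follows; the model, data loaders, and loss call are hypothetical placeholders standing in for the ProCC components detailed in Appendix Algorithm 1:

    from torch.optim import Adam

    # Per-stage epoch budgets reported in the paper (stage 1 / stage 2 / stage 3).
    EPOCHS = {
        "ut-zappos":  (30, 60, 20),
        "mit-states": (40, 80, 30),
        "c-gqa":      (50, 100, 25),
    }

    def train(model, loaders, dataset="mit-states", device="cuda"):
        """Hypothetical three-stage training loop mirroring the reported settings."""
        model.to(device)
        for stage, num_epochs in enumerate(EPOCHS[dataset], start=1):
            # Learning rate 5e-5 for the first two stages, 1e-5 for the third.
            lr = 5e-5 if stage < 3 else 1e-5
            optimizer = Adam(model.parameters(), lr=lr)  # Adam with default betas/eps
            for epoch in range(num_epochs):
                for images, attr_labels, obj_labels in loaders[stage]:
                    optimizer.zero_grad()
                    loss = model.compute_loss(   # placeholder for the stage-specific ProCC objective
                        images.to(device),
                        attr_labels.to(device),
                        obj_labels.to(device),
                        stage=stage,
                    )
                    loss.backward()
                    optimizer.step()
                # The paper applies an early stopping strategy on validation
                # performance; omitted here for brevity.

The batch size of 256 reported above would be set when constructing the (placeholder) data loaders.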