Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Attend and Enrich: Enhanced Visual Prompt for Zero-Shot Learning
Authors: Man Liu, Huihui Bai, Feng Li, Chunjie Zhang, Yunchao Wei, Tat-Seng Chua, Yao Zhao
AAAI 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results on three benchmarks show that our AENet outperforms existing state-of-the-art ZSL methods. |
| Researcher Affiliation | Academia | Man Liu^{2,4}, Huihui Bai^{1,2,4}, Feng Li^{3}, Chunjie Zhang^{2,4}, Yunchao Wei^{2,4}, Tat-Seng Chua^{5}, Yao Zhao^{2,4} — 1 Tangshan Research Institute of Beijing Jiaotong University; 2 Institute of Information Science, Beijing Jiaotong University; 3 Hefei University of Technology; 4 Beijing Key Laboratory of Advanced Information Science and Network Technology; 5 National University of Singapore |
| Pseudocode | No | The paper describes the methodology using textual explanations, mathematical equations, and a visual flowchart in Figure 2. It does not include any explicit pseudocode blocks or algorithms. |
| Open Source Code | Yes | Code: https://github.com/ManLiuCoder/AENet |
| Open Datasets | Yes | We conduct experiments on three standard benchmark datasets: Caltech-UCSD Birds-200-2011 (CUB) (Welinder et al. 2010), SUN Attribute (SUN) (Patterson and Hays 2012), and Animals with Attributes2 (AwA2) (Xian et al. 2019). |
| Dataset Splits | Yes | The categorization into seen and unseen categories follows the Proposed Split (PS) (Xian et al. 2019). The CUB dataset consists of 11,788 images illustrating 200 bird classes, with a split of 150/50 for seen/unseen classes... SUN is a vast scene dataset that contains 14,340 images spanning 717 classes, divided into seen/unseen classes at 645/72... AwA2 contains 37,322 images of 50 animal classes, with a 40/10 split for seen/unseen classes... |
| Hardware Specification | Yes | Our framework is implemented using PyTorch and executed on an NVIDIA GeForce RTX 3090 GPU. |
| Software Dependencies | No | Our framework is implemented using PyTorch and executed on an NVIDIA GeForce RTX 3090 GPU. The paper mentions PyTorch but does not specify a version number or other software dependencies with versions. |
| Experiment Setup | Yes | The input image resolution is 224 × 224, with a patch size of 16 × 16. ... We sweep prompt length T ∈ {1, 3, 5, 7, 9} to investigate the effect of the prompts P on classification performance. ... we set T = 5 for CUB, SUN, and AwA2 datasets. ... λcons and λdeb are the hyper-parameters controlling the weights of semantic consistency loss Lcons and the debiasing loss Ldeb, respectively. ... The best H is obtained when λcons = 1.0. ... Thus, we set λcons = 1.0 for optimal results. |
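The hyper-parameter selection quoted above (sweeping the prompt length T and the loss weight λcons, then keeping the configuration with the best harmonic mean H) can be sketched as a plain grid search. This is a hypothetical illustration, not the authors' code: `evaluate` is a toy stand-in for actually training and evaluating AENet, deliberately peaked at the values the paper reports (T = 5, λcons = 1.0).

```python
from itertools import product

def evaluate(T, lambda_cons):
    """Stand-in for training/evaluating the model; returns a harmonic mean H.

    A real implementation would train on seen classes and compute H from
    seen/unseen accuracies. This toy function simply peaks at T=5,
    lambda_cons=1.0 to mirror the reported best configuration.
    """
    return 1.0 / (1.0 + abs(T - 5) + abs(lambda_cons - 1.0))

def sweep(prompt_lengths=(1, 3, 5, 7, 9), cons_weights=(0.5, 1.0, 2.0)):
    """Grid-search T and lambda_cons; return the best config and all scores."""
    results = {(T, lc): evaluate(T, lc)
               for T, lc in product(prompt_lengths, cons_weights)}
    best = max(results, key=results.get)
    return best, results

best_config, all_scores = sweep()
print(best_config)  # with the toy evaluate(), this is (5, 1.0)
```

The sweep ranges for `cons_weights` are assumed for illustration; the paper only states that λcons = 1.0 gave the best H.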