HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning
Authors: Shiming Chen, Guosen Xie, Yang Liu, Qinmu Peng, Baigui Sun, Hao Li, Xinge You, Ling Shao
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on four benchmark datasets demonstrate that HSVA achieves superior performance on both conventional and generalized ZSL. |
| Researcher Affiliation | Collaboration | Huazhong University of Science and Technology (HUST), China; Alibaba Group, Hangzhou, China; Mohamed bin Zayed University of AI (MBZUAI), UAE; Inception Institute of Artificial Intelligence (IIAI), UAE |
| Pseudocode | No | The paper describes the method using equations and text, but does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/shiming-chen/HSVA . |
| Open Datasets | Yes | We conduct extensive experiments on four well-known ZSL benchmark datasets, including fine-grained datasets (e.g., CUB [43] and SUN [44]) and coarse-grained datasets (e.g., AWA1 [4] and AWA2 [45]). |
| Dataset Splits | Yes | We use the training splits proposed in [46]. In the CZSL setting, we synthesize 800, 400 and 200 features per unseen class to train the classifier for AWA1, CUB and SUN datasets, respectively. In the GZSL setting, we take 400 synthesized features per unseen class and 200 synthesized features per seen class to train the classifier for all datasets. (A hedged sketch of assembling these classifier training sets follows the table.) |
| Hardware Specification | No | The paper mentions a CNN backbone (ResNet-101) but does not provide specific details on the hardware (e.g., GPU model, CPU type) used for experiments. |
| Software Dependencies | No | The paper mentions the Adam optimizer and ResNet-101 but does not specify any software versions for libraries, frameworks, or programming languages used in the implementation. |
| Experiment Setup | Yes | We employ the Adam optimizer [47] with β1 = 0.5 and β2 = 0.999. We use an annealing scheme [48] to increase the weights γ, λ1, λ2, λ3 with the same setting for all datasets. Specifically, γ is increased by a rate of 0.0026 per epoch until epoch 90, λ1 is increased from epoch 21 to 75 by 0.044 per epoch, and λ2, λ3 are increased from epoch 0 to epoch 22 by a rate of 0.54 per epoch. The dimensions of the structure-aligned and distribution-aligned spaces are set to 2048 and 64, respectively, for three datasets (i.e., CUB, AWA1, AWA2), and to 2048 and 128 for the SUN benchmark. (A hedged sketch of this annealing schedule follows the table.) |
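
A minimal sketch of the loss-weight annealing schedule quoted in the Experiment Setup row, written in Python. Only the per-epoch rates, the epoch ranges, and the Adam betas come from the paper; the function name `annealed_weights`, starting each weight at 0, and freezing each weight after its final epoch are assumptions made for illustration.

```python
# Sketch of the quoted annealing schedule; not the authors' implementation.
import torch


def annealed_weights(epoch: int) -> dict:
    """Loss weights gamma, lambda1, lambda2, lambda3 at a given training epoch."""
    gamma = 0.0026 * min(epoch, 90)             # +0.0026 per epoch until epoch 90
    lam1 = 0.044 * max(0, min(epoch, 75) - 21)  # +0.044 per epoch, epochs 21 to 75
    lam23 = 0.54 * min(epoch, 22)               # +0.54 per epoch, epochs 0 to 22
    return {"gamma": gamma, "lambda1": lam1, "lambda2": lam23, "lambda3": lam23}


# Adam with the quoted betas; the dummy parameter only makes the call runnable.
params = [torch.nn.Parameter(torch.zeros(1))]
optimizer = torch.optim.Adam(params, betas=(0.5, 0.999))

print(annealed_weights(30))  # weights at epoch 30, for example
```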
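The Dataset Splits row can likewise be read as a recipe for building the GZSL classifier training set: 400 synthesized features per unseen class and 200 per seen class. The sketch below assumes a hypothetical `generate_features(class_id, n)` helper standing in for the trained HSVA feature generator and a 2048-dimensional feature space; only the per-class counts are from the paper.

```python
# Sketch of assembling the GZSL classifier training set; not the authors' code.
import numpy as np


def build_gzsl_train_set(generate_features, seen_classes, unseen_classes,
                         n_seen=200, n_unseen=400):
    """Stack per-class synthesized features and their labels."""
    feats, labels = [], []
    for c in seen_classes:
        feats.append(generate_features(c, n_seen))     # shape (n_seen, feat_dim)
        labels.append(np.full(n_seen, c))
    for c in unseen_classes:
        feats.append(generate_features(c, n_unseen))   # shape (n_unseen, feat_dim)
        labels.append(np.full(n_unseen, c))
    return np.concatenate(feats), np.concatenate(labels)


# Usage with a dummy generator producing random 2048-d features.
dummy_gen = lambda c, n: np.random.randn(n, 2048)
X, y = build_gzsl_train_set(dummy_gen, seen_classes=range(40),
                            unseen_classes=range(40, 50))
```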