VS-Boost: Boosting Visual-Semantic Association for Generalized Zero-Shot Learning
Authors: Xiaofan Li, Yachao Zhang, Shiran Bian, Yanyun Qu, Yuan Xie, Zhongchao Shi, Jianping Fan
IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experimental results demonstrate that our method is effective and achieves significant gains on five benchmark datasets compared with the state-of-the-art methods. |
| Researcher Affiliation | Collaboration | Xiaofan Li¹, Yachao Zhang², Shiran Bian¹, Yanyun Qu¹, Yuan Xie³, Zhongchao Shi⁴ and Jianping Fan⁴ — ¹School of Informatics, Xiamen University, Fujian, China; ²Tsinghua University, Shenzhen, China; ³School of Computer Science and Technology, East China Normal University, Shanghai, China; ⁴Lenovo Research, Beijing, China |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We evaluate our method on the five benchmark datasets for zero-shot learning: Attribute Pascal and Yahoo (APY) [Farhadi et al., 2009], Animals with Attributes (AWA) [Xian et al., 2018a], Caltech-UCSD Birds-200-2011 (CUB) [Welinder et al., 2010], Oxford Flowers (FLO) [Nilsback and Zisserman, 2008] and SUN Attribute (SUN) [Patterson and Hays, 2012]. |
| Dataset Splits | Yes | In addition, we adopt the Proposed Split (PS) [Xian et al., 2018a] to divide all classes on each dataset into seen and unseen classes. |
| Hardware Specification | Yes | The proposed model is trained and evaluated on one GeForce RTX 3090 GPU. |
| Software Dependencies | Yes | We implement our model using PyTorch on the Python 3.7 platform. |
| Experiment Setup | Yes | Feature generator G, discriminator D, and semantic regressor R are multilayer perceptrons that contain a 4,096-unit hidden layer with LeakyReLU activation. The feature encoder E is a [2048, 2048] Linear layer with LeakyReLU activation. Finally, we use N-way, K-shot (N-K) random sampling for training, with a random mini-batch size of 8-64 for APY and AWA, 4-16 for CUB, 1-32 for FLO and 64-2 for SUN in our method. |
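
Since no official code is released, the quoted architecture details can only be read off the text. The following is a minimal PyTorch sketch of the network shapes as described (G, D, and R as single-hidden-layer MLPs with 4,096 LeakyReLU units, E as a [2048, 2048] linear layer with LeakyReLU). The feature dimension, attribute dimension, output activations, LeakyReLU slope, and module names are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the module shapes quoted above.
# Only the 4,096-unit hidden layer and the [2048, 2048] encoder come from
# the paper; all other dimensions and choices are assumptions.
import torch
import torch.nn as nn

FEAT_DIM = 2048   # visual feature size (assumed, e.g. ResNet-101 features)
ATT_DIM = 85      # semantic attribute size (assumed, e.g. AWA)
HIDDEN = 4096     # hidden width stated in the paper


class Generator(nn.Module):
    """G: (noise, attributes) -> synthetic visual feature."""
    def __init__(self, noise_dim=ATT_DIM):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(noise_dim + ATT_DIM, HIDDEN),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(HIDDEN, FEAT_DIM),
        )

    def forward(self, noise, att):
        return self.net(torch.cat([noise, att], dim=1))


class Discriminator(nn.Module):
    """D: (visual feature, attributes) -> realness score."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + ATT_DIM, HIDDEN),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(HIDDEN, 1),
        )

    def forward(self, feat, att):
        return self.net(torch.cat([feat, att], dim=1))


class SemanticRegressor(nn.Module):
    """R: visual feature -> reconstructed attributes."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, HIDDEN),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Linear(HIDDEN, ATT_DIM),
        )

    def forward(self, feat):
        return self.net(feat)


class Encoder(nn.Module):
    """E: [2048, 2048] linear layer with LeakyReLU, as quoted."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, FEAT_DIM),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, feat):
        return self.net(feat)
```

Under this reading, a training mini-batch would be an N-way, K-shot episode (N classes sampled at random, K features per class), with the quoted N-K pairs (e.g. 8-64 for APY and AWA) setting the episode size per dataset.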