VS-Boost: Boosting Visual-Semantic Association for Generalized Zero-Shot Learning

Authors: Xiaofan Li, Yachao Zhang, Shiran Bian, Yanyun Qu, Yuan Xie, Zhongchao Shi, Jianping Fan

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results demonstrate that our method is effective and achieves significant gains on five benchmark datasets compared with the state-of-the-art methods.
Researcher Affiliation | Collaboration | Xiaofan Li1, Yachao Zhang2, Shiran Bian1, Yanyun Qu1, Yuan Xie3, Zhongchao Shi4 and Jianping Fan4; 1School of Informatics, Xiamen University, Fujian, China; 2Tsinghua University, Shenzhen, China; 3School of Computer Science and Technology, East China Normal University, Shanghai, China; 4Lenovo Research, Beijing, China
Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We evaluate our method on the five benchmark datasets for zero-shot learning: Attribute Pascal and Yahoo (APY [Farhadi et al., 2009]), Animals with Attributes (AWA [Xian et al., 2018a]), Caltech-UCSD Birds-200-2011 (CUB) [Welinder et al., 2010], Oxford Flowers (FLO) [Nilsback and Zisserman, 2008] and SUN Attribute (SUN) [Patterson and Hays, 2012].
Dataset Splits | Yes | In addition, we adopt the Proposed Split (PS) [Xian et al., 2018a] to divide all classes on each dataset into seen and unseen classes. (A hedged loading sketch for this split format follows the table.)
Hardware Specification | Yes | The proposed model is trained and evaluated on one GeForce RTX 3090 GPU.
Software Dependencies | Yes | We implement our model by using PyTorch based on the Python 3.7 platform.
Experiment Setup | Yes | Feature generator G, discriminator D, and semantic regressor R are multilayer perceptrons that contain a 4,096-unit hidden layer with LeakyReLU activation. The feature encoder E is a [2048, 2048] Linear layer with LeakyReLU activation. Finally, we use the task with N-way, K-shot (N-K) random sampling for training, and use a random mini-batch size of 8-64 for APY and AWA, 4-16 for CUB, 1-32 for FLO and 64-2 for SUN in our method. (A hedged PyTorch sketch of these modules follows the table.)
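
Since no code is released, the following is a minimal sketch of how the Proposed Split (PS) cited in the Dataset Splits row is typically consumed, assuming the res101.mat / att_splits.mat files distributed with the benchmark of Xian et al. (2018a). The file names and keys are assumptions about that standard release, not details stated in the paper.

```python
# Hedged sketch: loading a ZSL dataset under the Proposed Split (PS).
# Assumes the res101.mat / att_splits.mat files from the Xian et al.
# benchmark release; the paper itself does not name these files.
from scipy.io import loadmat

def load_proposed_split(res_path="res101.mat", split_path="att_splits.mat"):
    res = loadmat(res_path)
    splits = loadmat(split_path)

    features = res["features"].T             # (num_images, 2048) ResNet-101 features
    labels = res["labels"].squeeze() - 1     # 0-indexed class labels
    attributes = splits["att"].T              # (num_classes, attribute_dim)

    # MATLAB location indices are 1-based; convert to 0-based integers.
    trainval = splits["trainval_loc"].squeeze().astype(int) - 1
    test_seen = splits["test_seen_loc"].squeeze().astype(int) - 1
    test_unseen = splits["test_unseen_loc"].squeeze().astype(int) - 1

    return {
        "train_x": features[trainval], "train_y": labels[trainval],
        "test_seen_x": features[test_seen], "test_seen_y": labels[test_seen],
        "test_unseen_x": features[test_unseen], "test_unseen_y": labels[test_unseen],
        "attributes": attributes,
    }
```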
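
Likewise, a minimal PyTorch sketch of the modules quoted in the Experiment Setup row, assuming 2048-d ResNet-101 features and a generative pipeline conditioned on class attributes. The input/output wiring, noise dimension, LeakyReLU slope, and the attribute dimension ATT_DIM are assumptions; the quote only fixes the hidden-layer sizes and the activation type.

```python
# Hedged sketch of G, D, R, and E as described in the setup quote.
# Dimensions other than the 4,096-unit hidden layer and the [2048, 2048]
# encoder are assumptions (e.g., ATT_DIM depends on the dataset).
import torch
import torch.nn as nn

FEAT_DIM = 2048       # ResNet-101 visual feature size
ATT_DIM = 312         # e.g., CUB attribute dimension (dataset dependent)
NOISE_DIM = ATT_DIM   # assumed: noise vector sized like the attributes
HIDDEN = 4096

class Generator(nn.Module):
    """G: (attribute, noise) -> synthetic visual feature (wiring assumed)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(ATT_DIM + NOISE_DIM, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, FEAT_DIM), nn.ReLU())  # output ReLU is assumed
    def forward(self, att, z):
        return self.net(torch.cat([att, z], dim=1))

class Discriminator(nn.Module):
    """D: (visual feature, attribute) -> realness score (conditioning assumed)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM + ATT_DIM, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, 1))
    def forward(self, x, att):
        return self.net(torch.cat([x, att], dim=1))

class Regressor(nn.Module):
    """R: visual feature -> reconstructed attribute vector."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(FEAT_DIM, HIDDEN), nn.LeakyReLU(0.2),
            nn.Linear(HIDDEN, ATT_DIM))
    def forward(self, x):
        return self.net(x)

class Encoder(nn.Module):
    """E: a [2048, 2048] Linear layer with LeakyReLU activation."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(FEAT_DIM, FEAT_DIM), nn.LeakyReLU(0.2))
    def forward(self, x):
        return self.net(x)
```

Instantiating Generator(), Discriminator(), Regressor(), and Encoder() on the features returned by load_proposed_split above yields the four trainable components named in the setup quote; the N-way, K-shot sampling and per-dataset mini-batch sizes would then drive the training loop.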