Stacked Semantics-Guided Attention Model for Fine-Grained Zero-Shot Learning

Authors: Yunlong Yu, Zhong Ji, Yanwei Fu, Jichang Guo, Yanwei Pang, Zhongfei (Mark) Zhang

NeurIPS 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we carry out several experiments to evaluate the proposed S2GA networks on both zero-shot classification and zero-shot retrieval tasks.
Researcher Affiliation | Collaboration | Yunlong Yu, Zhong Ji (School of Electrical and Information Engineering, Tianjin University; {yuyunlong,jizhong}@tju.edu.cn); Yanwei Fu (School of Data Science, Fudan University; AITRICS; yanweifu@fudan.edu.cn); Jichang Guo, Yanwei Pang (School of Electrical and Information Engineering, Tianjin University; {jcguo,pyw}@tju.edu.cn); Zhongfei (Mark) Zhang (Computer Science Department, Binghamton University; zhongfei@cs.binghamton.edu)
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. (An illustrative attention sketch is given after the table.)
Open Source Code | Yes | https://github.com/ylytju/sga
Open Datasets | Yes | Datasets: Following [6], we conduct experiments on two fine-grained bird datasets, CUB [22] and NABirds [21].
Dataset Splits | Yes | For easy comparison with existing approaches, we use the same split as in [2], with 150 classes for training and 50 disjoint classes for testing. Following [6, 33], we evaluate the approaches on two split settings: Super-Category-Shared (SCS) and Super-Category-Exclusive (SCE). (A minimal split sketch follows after the table.)
Hardware Specification | No | The authors are very grateful for NVIDIA’s support in providing GPUs that made this work possible. However, specific GPU models and other hardware details are not provided.
Software Dependencies | No | The whole architecture is implemented in TensorFlow and trained end-to-end with fixed local visual features. However, no version numbers for TensorFlow or other software dependencies are provided.
Experiment Setup | Yes | In our system, the dimensionality d of the hidden layer and the batch size are set to 128 and 512, respectively. For optimization, the RMSProp method is used with a base learning rate of 10^-4. The architecture is trained for up to 3,000 iterations, until the validation error has not improved in the last 30 iterations. (A training-loop sketch follows after the table.)
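
Illustrative sketches for the rows above follow. First, since the paper ships no algorithm block (see the Pseudocode row), here is a hedged NumPy sketch of what a single semantics-guided attention hop could look like in the style of stacked attention models. The weight names (`Wv`, `Wq`, `w`, `Wback`), dimensions, and the stacking rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def attention_hop(V, q, Wv, Wq, w):
    """One hop: V is (m, dv) local visual features, q is a (dq,) semantic
    query; returns region weights and the attended visual feature."""
    h = np.tanh(V @ Wv + q @ Wq)   # (m, d) joint hidden representation
    alpha = softmax(h @ w)         # (m,) attention over the m regions
    return alpha, alpha @ V        # (dv,) weighted sum of local features

m, dv, dq, d = 6, 512, 312, 128    # e.g. 312-dim CUB attribute vectors
V = rng.normal(size=(m, dv))       # placeholder local visual features
q = rng.normal(size=dq)            # placeholder class semantic vector
Wv, Wq, w = (0.01 * rng.normal(size=s) for s in [(dv, d), (dq, d), (d,)])

alpha1, v1 = attention_hop(V, q, Wv, Wq, w)

# "Stacking" (illustrative only): fold the attended feature back into the
# query and attend again, in the spirit of stacked attention networks.
Wback = 0.01 * rng.normal(size=(dv, dq))   # hypothetical projection
alpha2, v2 = attention_hop(V, q + v1 @ Wback, Wv, Wq, w)
```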
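Next, a minimal NumPy sketch of the class-disjoint protocol behind the Dataset Splits row (150 training classes, 50 disjoint test classes on CUB). The placeholder labels and class-id ranges are assumptions; the published split files of [2] define the actual class ids.

```python
import numpy as np

def zero_shot_split(labels, train_classes, test_classes):
    """Return sample indices for a class-disjoint train/test partition."""
    labels = np.asarray(labels)
    assert set(train_classes).isdisjoint(test_classes)  # ZSL requirement
    train_idx = np.flatnonzero(np.isin(labels, train_classes))
    test_idx = np.flatnonzero(np.isin(labels, test_classes))
    return train_idx, test_idx

# CUB has 200 classes and 11,788 images; the first 150 class ids stand in
# for the fixed split of [2].
rng = np.random.default_rng(0)
labels = rng.integers(0, 200, size=11788)        # placeholder labels
train_idx, test_idx = zero_shot_split(labels,
                                      np.arange(150),       # seen classes
                                      np.arange(150, 200))  # unseen classes
```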
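Finally, the hyperparameters in the Experiment Setup row map onto a training loop roughly as below. This is a minimal TF2/Keras sketch (the paper's TensorFlow version is unspecified); the stand-in network and synthetic data are assumptions, not the actual S2GA architecture.

```python
import tensorflow as tf

HIDDEN_DIM = 128      # dimensionality d of the hidden layer (from the paper)
BATCH_SIZE = 512      # batch size (from the paper)
LEARNING_RATE = 1e-4  # base learning rate for RMSProp (from the paper)
MAX_ITERS = 3000      # train for up to 3,000 iterations (from the paper)
PATIENCE = 30         # stop if validation error stalls for 30 iterations

# Hypothetical stand-in network: one hidden layer of width d; the real
# S2GA architecture is more involved and is not reproduced here.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(HIDDEN_DIM, activation="tanh"),
    tf.keras.layers.Dense(150),  # 150 seen classes on CUB
])
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.RMSprop(learning_rate=LEARNING_RATE)

# Synthetic placeholder data standing in for the fixed local visual features.
x_train = tf.random.normal([2048, 512])
y_train = tf.random.uniform([2048], maxval=150, dtype=tf.int32)
x_val = tf.random.normal([256, 512])
y_val = tf.random.uniform([256], maxval=150, dtype=tf.int32)
train_ds = iter(tf.data.Dataset.from_tensor_slices((x_train, y_train))
                .shuffle(2048).repeat().batch(BATCH_SIZE))

best_val, since_best = float("inf"), 0
for step in range(MAX_ITERS):
    x, y = next(train_ds)
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

    val_err = loss_fn(y_val, model(x_val, training=False)).numpy()
    if val_err < best_val:
        best_val, since_best = val_err, 0
    else:
        since_best += 1
        if since_best >= PATIENCE:  # early stopping as described in the paper
            break
```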