Transductive Zero-Shot Learning with Visual Structure Constraint

Authors: Ziyu Wan, Dongdong Chen, Yan Li, Xingguang Yan, Junge Zhang, Yizhou Yu, Jing Liao

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on many widely used datasets demonstrate that the proposed visual structure constraint can bring substantial performance gains consistently and achieve state-of-the-art results.
Researcher Affiliation | Collaboration | 1 City University of Hong Kong; 2 Microsoft Cloud+AI; 3 PCG, Tencent; 4 Shenzhen University; 5 NLPR, CASIA; 6 Deepwise AI Lab
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | The source code is available at https://github.com/raywzy/VSC.
Open Datasets | Yes | Extensive experiments are conducted on four widely-used ZSL benchmark datasets, i.e., AwA1, AwA2, CUB, SUN10, and SUN72. Following the same configuration of previous methods, two different data split strategies are adopted: 1) Standard Splits (SS): the standard seen/unseen class split first proposed in [17] and then widely used in most ZSL works; 2) Proposed Splits (PS): the split proposed by [32] to remove the overlapping ImageNet-1K classes from the target domain, since ImageNet-1K is used to pre-train the CNN model.
Dataset Splits | Yes | Following the same configuration of previous methods, two different data split strategies are adopted: 1) Standard Splits (SS), first proposed in [17] and widely used in most ZSL works; 2) Proposed Splits (PS), proposed by [32] to remove the overlapping ImageNet-1K classes from the target domain. A loading sketch follows the table.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions the Adam optimizer and a pretrained ResNet-101 but does not specify software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | Using the Adam optimizer, the method is trained for 5000 epochs with a fixed learning rate of 0.0001. The weight β in CDVSc and BMVSc is cross-validated in [10^-4, 10^-3] and [10^-5, 10^-4] respectively, while WDVSc directly sets β = 0.001 because of its very stable performance. All images are resized to 224×224 without any data augmentation, and the dimension of the extracted features is 2048. The hidden unit numbers of the two FC layers in the embedding network are both 2048. Both visual features and semantic attributes are L2-normalized. A training sketch follows the table.
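
The Dataset Splits row references the Proposed Splits of [32]. Below is a minimal loading sketch, assuming the publicly released GBU benchmark files (res101.mat and att_splits.mat) that [32] distributes; the file names and keys come from that release, not from this paper's own code.

```python
# Hedged sketch: load precomputed ResNet-101 features and the Proposed Splits (PS).
# Assumes the GBU benchmark release of [32]; adjust paths to your local copy.
from scipy.io import loadmat

feats = loadmat("res101.mat")       # per-image 2048-d ResNet-101 features
splits = loadmat("att_splits.mat")  # class attributes + split index vectors

features = feats["features"].T          # (num_images, 2048)
labels = feats["labels"].squeeze() - 1  # MATLAB 1-based -> 0-based class ids
attrs = splits["att"].T                 # (num_classes, attr_dim)

# PS split: seen-class train/val images vs. unseen-class test images.
trainval_idx = splits["trainval_loc"].squeeze() - 1
test_unseen_idx = splits["test_unseen_loc"].squeeze() - 1
```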
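The Experiment Setup row pins down the optimizer, schedule, and network widths. The snippet below is a minimal sketch of that configuration, assuming PyTorch; the class name EmbeddingNet and the 85-d attribute input are illustrative assumptions, and the visual structure constraint losses (CDVSc/BMVSc/WDVSc) are not implemented here.

```python
# Hedged sketch of the embedding network and optimizer from the setup above.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Two FC layers with 2048 hidden units each, mapping L2-normalized
    semantic attributes into the 2048-d visual feature space."""
    def __init__(self, attr_dim, hidden_dim=2048, feat_dim=2048):
        super().__init__()
        self.fc1 = nn.Linear(attr_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, feat_dim)

    def forward(self, attrs):
        attrs = F.normalize(attrs, p=2, dim=-1)   # L2-normalize attributes
        return self.fc2(torch.relu(self.fc1(attrs)))

net = EmbeddingNet(attr_dim=85)  # 85-d attributes (e.g., AwA); hypothetical choice
optimizer = torch.optim.Adam(net.parameters(), lr=1e-4)  # fixed LR, 5000 epochs
beta = 1e-3  # WDVSc weight; CDVSc/BMVSc cross-validate in [1e-4,1e-3] / [1e-5,1e-4]
```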