Lifelong Zero-Shot Learning
Authors: Kun Wei, Cheng Deng, Xu Yang
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on these benchmarks indicate that our method tackles the LZSL problem effectively, while existing ZSL methods fail. |
| Researcher Affiliation | Academia | Kun Wei, Cheng Deng, Xu Yang. School of Electronic Engineering, Xidian University, Xi'an 710071, China. {weikunsk, chdeng.xd, xuyang.xd}@gmail.com |
| Pseudocode | Yes | Algorithm 1 The Process of Selective Retraining |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We evaluate our method on four datasets: Attribute Pascal and Yahoo dataset (aPY) [Farhadi et al., 2009], Animals with Attributes 1 (AWA1) [Xian et al., 2018a], Caltech-UCSD-Birds 200-2011 dataset (CUB) [Wah et al., 2011], and SUN Attribute dataset (SUN) [Patterson and Hays, 2012]. |
| Dataset Splits | No | The paper defines metrics for unseen and seen test classes and discusses training and testing stages, but it does not specify a distinct validation set or its split percentages/counts for hyperparameter tuning. |
| Hardware Specification | No | The paper does not specify any particular GPU models, CPU models, or other hardware used for running the experiments. |
| Software Dependencies | No | The paper states "our method is implemented with PyTorch" but does not provide a specific version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | All encoders and decoders are multilayer perceptrons with one hidden layer. We use 1560 hidden units for the image feature encoder and 1660 for the decoder. The attribute encoder and decoder have 1450 and 660 hidden units, respectively. δ is increased from epoch 6 to epoch 22 by a rate of 0.54 per epoch, while γ is increased from epoch 21 to 75 by 0.044 per epoch. The weight λ of the KL-divergence is increased by a rate of 0.0026 per epoch until epoch 90. Besides, we use the L1 distance as reconstruction error, which obtains better results than L2. For every dataset, the number of epochs is set to 100, and the batch size is set to 50. The learning rate of VAEs is set to 0.00015, which is set to 0.001 for classifiers. In addition, our method is implemented with PyTorch and optimized by the Adam optimizer. |
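The linear warm-up schedules for δ, γ, and λ quoted in the Experiment Setup row can be sketched as plain functions. This is a minimal sketch based only on the rates and epoch windows reported above; the function names, the zero starting value, and the inclusive epoch bounds are our assumptions, not details stated in the paper.

```python
# Hypothetical reconstruction of the paper's weight-annealing schedules.
# Each weight starts at 0 (assumed) and grows linearly by a fixed rate
# per epoch inside its warm-up window, then stays constant.

def ramp(epoch: int, start: int, end: int, rate: float) -> float:
    """Linear warm-up: 0 before `start`, +`rate` per epoch through `end`."""
    steps = max(0, min(epoch, end) - start + 1)
    return rate * steps

def delta(epoch: int) -> float:
    # δ: increased from epoch 6 to epoch 22 by 0.54 per epoch
    return ramp(epoch, start=6, end=22, rate=0.54)

def gamma(epoch: int) -> float:
    # γ: increased from epoch 21 to epoch 75 by 0.044 per epoch
    return ramp(epoch, start=21, end=75, rate=0.044)

def kl_weight(epoch: int) -> float:
    # λ: increased by 0.0026 per epoch until epoch 90 (start epoch assumed 1)
    return ramp(epoch, start=1, end=90, rate=0.0026)
```

For example, δ is zero through epoch 5 and reaches its maximum of 17 × 0.54 ≈ 9.18 at epoch 22, after which it is held fixed under these assumptions.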