Meta-Learning for Generalized Zero-Shot Learning
Authors: Vinay Kumar Verma, Dhanajit Brahma, Piyush Rai (pp. 6062-6069)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a comprehensive evaluation of our approach ZSML (Zero-Shot Meta-Learning) by applying it on both standard ZSL and generalized ZSL problems and compare it with several state-of-the-art methods. We also perform several ablation studies to demonstrate/disentangle the benefits of the various aspects of our proposed approach. |
| Researcher Affiliation | Academia | Vinay Kumar Verma, Dhanajit Brahma, Piyush Rai Department of Computer Science and Engineering, IIT Kanpur, India {vkverma, dhanajit, piyush}@cse.iitk.ac.in |
| Pseudocode | No | The paper states: 'Due to the lack of space, the complete Algorithm and details about the datasets are provided in the Supplementary Material.' However, no pseudocode or algorithm blocks are present in the provided text. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper, nor does it explicitly state that code will be released or is available in supplementary materials. |
| Open Datasets | Yes | We evaluate our approach on the following benchmark ZSL datasets: SUN (Xiao et al. 2010) and CUB (Welinder et al. 2010) which are fine-grained and considered very challenging; AWA1 (Lampert, Nickisch, and Harmeling 2009) and AWA2 (Xian et al. 2018a); aPY (Farhadi et al. 2009) with diverse classes that makes this dataset very challenging. |
| Dataset Splits | Yes | MAML splits this set further into a training set Ttr and a validation set Tval, i.e., Ti = {Ttr, Tval}. The split is done such that Ttr has very few examples per class. We follow the general notion of N-way K-shot problem (Vinyals et al. 2016), i.e., Ttr contains N classes with K examples from each class. ... a key difference with MAML, to mimic the ZSL behaviour, is that for each task Ti = {Ttr, Tval}, the classes of Ttr and Tval are disjoint. (A minimal task-sampling sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions specific software components like 'Wasserstein GAN (WGAN)' and 'model-agnostic meta learning (MAML)' and 'ResNet-101' features but does not provide specific version numbers for any software or libraries used in the experiments. |
| Experiment Setup | Yes | The generator and discriminator are 2-hidden layer networks with hidden layer size 2048 and 512, respectively. ... We choose the model-agnostic meta learning (MAML) (Finn, Abbeel, and Levine 2017) as our meta-learner due to its generic nature; it only requires a differentiable model and can work with any loss function. ... The model is trained using an episodic formulation where each round samples a batch of tasks and uses gradient-descent based updates (inner loop) for the parameters θi specific to each task Ti. ... Here, α is the hyper-parameter and L denotes the loss function being used. ... Here, β1 is the learning rate for the meta-step... (A MAML-style training sketch follows the table.) |
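To make the disjoint-class task construction quoted in the "Dataset Splits" row concrete, here is a minimal NumPy sketch of an N-way K-shot sampler whose Ttr and Tval draw from non-overlapping class sets. The function name `sample_zsl_task`, the default sizes, and the dict-of-arrays input layout are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def sample_zsl_task(features_by_class, n_way=5, k_shot=5,
                    n_val_classes=5, n_val_per_class=5, rng=None):
    """Sample one task Ti = {Ttr, Tval} whose class sets are disjoint,
    mimicking the ZSL-style episode described in the paper.

    features_by_class: dict mapping class label -> array (num_examples, dim).
    All names and default sizes here are illustrative assumptions.
    """
    rng = rng or np.random.default_rng()
    labels = list(features_by_class)
    # Draw n_way + n_val_classes distinct classes, then split them so that
    # Ttr and Tval share no classes (the key difference from vanilla MAML).
    picked = rng.choice(len(labels), size=n_way + n_val_classes, replace=False)
    tr_classes = [labels[i] for i in picked[:n_way]]
    val_classes = [labels[i] for i in picked[n_way:]]

    def gather(class_list, per_class):
        xs, ys = [], []
        for local_y, c in enumerate(class_list):
            feats = features_by_class[c]
            idx = rng.choice(len(feats), size=per_class, replace=False)
            xs.append(feats[idx])
            ys.extend([local_y] * per_class)
        return np.concatenate(xs), np.asarray(ys)

    t_tr = gather(tr_classes, k_shot)             # N-way K-shot support set
    t_val = gather(val_classes, n_val_per_class)  # disjoint-class validation set
    return t_tr, t_val
```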
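Similarly, the "Experiment Setup" row can be read as the following PyTorch sketch: a generator and discriminator with two hidden layers of size 2048 and 512 (the paper's sizes), adapted per task with a MAML-style inner gradient step at rate α and a meta-update at rate β1. The input/output dimensions, learning-rate values, activation choices, and the simplified WGAN generator term are assumptions; critic training and the gradient penalty are omitted.

```python
import torch
import torch.nn as nn
from torch.func import functional_call

# Hidden sizes follow the paper (generator 2048, discriminator 512);
# attribute/noise/feature dimensions are illustrative placeholders.
ATTR_DIM, NOISE_DIM, FEAT_DIM = 85, 85, 2048

G = nn.Sequential(
    nn.Linear(ATTR_DIM + NOISE_DIM, 2048), nn.LeakyReLU(0.2),
    nn.Linear(2048, 2048), nn.LeakyReLU(0.2),
    nn.Linear(2048, FEAT_DIM),
)
D = nn.Sequential(
    nn.Linear(FEAT_DIM + ATTR_DIM, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 512), nn.LeakyReLU(0.2),
    nn.Linear(512, 1),
)
D.requires_grad_(False)  # critic kept fixed in this sketch

def gen_loss(params, attrs):
    """WGAN-style generator term, -E[D(G(z, a))]; critic updates and the
    gradient penalty are omitted from this sketch."""
    z = torch.randn(attrs.size(0), NOISE_DIM)
    fake = functional_call(G, params, (torch.cat([attrs, z], dim=1),))
    return -D(torch.cat([fake, attrs], dim=1)).mean()

alpha, beta1 = 1e-3, 1e-4          # inner-loop and meta learning rates (assumed)
meta_opt = torch.optim.Adam(G.parameters(), lr=beta1)

for episode in range(100):         # each episode samples a batch of tasks
    meta_opt.zero_grad()
    for _ in range(4):             # task batch size (assumed)
        # Toy stand-ins for Ttr/Tval attributes; real tasks would use
        # disjoint class sets as in the sampler above.
        a_tr, a_val = torch.randn(25, ATTR_DIM), torch.randn(25, ATTR_DIM)

        # Inner loop: theta_i = theta - alpha * grad_theta L(Ttr; theta).
        params = dict(G.named_parameters())
        grads = torch.autograd.grad(gen_loss(params, a_tr),
                                    list(params.values()), create_graph=True)
        adapted = {n: p - alpha * g
                   for (n, p), g in zip(params.items(), grads)}

        # Outer loss on Tval under the adapted weights; gradients flow
        # back to the shared initialization theta.
        gen_loss(adapted, a_val).backward()
    meta_opt.step()                # meta-update with learning rate beta1
```

Note that `create_graph=True` makes this a second-order update, matching MAML's formulation; a first-order variant would simply drop it and detach the inner gradients.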