Zero-Shot Learning via Class-Conditioned Deep Generative Models
Authors: Wenlin Wang, Yunchen Pu, Vinay Kumar Verma, Kai Fan, Yizhe Zhang, Changyou Chen, Piyush Rai, Lawrence Carin
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare our model with several state-of-the-art methods through a comprehensive set of experiments on a variety of benchmark data sets. |
| Researcher Affiliation | Academia | Wenlin Wang,1 Yunchen Pu,1 Vinay Kumar Verma,3 Kai Fan,2 Yizhe Zhang,2 Changyou Chen,4 Piyush Rai,3 Lawrence Carin1 (1Department of Electrical and Computer Engineering, Duke University; 2Computational Biology and Bioinformatics, Duke University; 3Department of Computer Science and Engineering, IIT Kanpur, India; 4Department of Computer Science and Engineering, SUNY at Buffalo) |
| Pseudocode | No | The paper provides mathematical formulations and descriptions of its model but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about releasing source code or providing a link to it. |
| Open Datasets | Yes | We conduct our experiments on the following datasets: (i) Animal with Attributes (AwA) (Lampert et al. 2014); (ii) Caltech-UCSD Birds-200-2011 (CUB-200) (Wah et al. 2011); and (iii) SUN attribute (SUN) (Patterson et al. 2012). For the large-scale dataset (ImageNet), we follow (Fu et al. 2016), for which 1000 classes from ILSVRC2012 (Russakovsky et al. 2015) are used as seen classes, while 360 non-overlapping classes of ILSVRC2010 (Deng et al. 2009) are used as unseen classes. |
| Dataset Splits | Yes | For hyper-parameter selection, we divide the training set into training and validation sets; the validation set is used for hyper-parameter tuning, while setting λ = 1 across all our experiments. (A hedged sketch of such a split appears below the table.) |
| Hardware Specification | Yes | Our model is written in Tensorflow and trained on NVIDIA GTX TITAN X with 3072 cores and 11GB global memory. |
| Software Dependencies | No | The paper states, 'Our model is written in Tensorflow', but it does not specify a version number for Tensorflow or any other software dependencies. |
| Experiment Setup | Yes | For hyper-parameter selection, we divide the training set into training and validation sets; the validation set is used for hyper-parameter tuning, while setting λ = 1 across all our experiments. For the VAE model, a multi-layer perceptron (MLP) is used for both the encoder qφ(z\|x) and the decoder pθ(x\|z). The encoder and decoder are each defined by an MLP with two hidden layers, with 1000 nodes in each layer. ReLU is used as the nonlinear activation function on each hidden layer, and dropout with constant rate 0.8 is used to avoid overfitting. The dimensionality of the latent space z was set to 100 for the small datasets and 500 for ImageNet. (A hedged sketch of this architecture appears below the table.) |
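
The Dataset Splits row describes carving a validation set out of the seen-class training data for hyper-parameter tuning. Below is a minimal sketch of such a split in Python/NumPy; the paper does not state whether the split is made over instances or over classes, nor the validation fraction, so the instance-level split and the `val_fraction` default here are assumptions, not the authors' procedure.

```python
# Minimal train/validation split sketch for hyper-parameter tuning.
# Assumption: an instance-level random split; the paper only says the
# training set is divided into training and validation sets.
import numpy as np

def train_val_split(features, labels, val_fraction=0.2, seed=0):
    """Shuffle indices and carve off a validation fold for tuning."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(features))
    n_val = int(len(features) * val_fraction)
    val_idx, train_idx = idx[:n_val], idx[n_val:]
    return (features[train_idx], labels[train_idx],
            features[val_idx], labels[val_idx])

LAMBDA = 1.0  # the paper fixes λ = 1 across all experiments
```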
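
The Experiment Setup row pins down the encoder/decoder architecture concretely: two-hidden-layer MLPs with 1000 ReLU units per layer, dropout, and a 100-dimensional latent space (500 for ImageNet). The sketch below reconstructs that architecture with the Keras API in TensorFlow 2; the input feature dimension `X_DIM`, the Gaussian parameterization details, and the interpretation of the reported dropout rate 0.8 (taken here as a TF1-style keep probability, i.e., a Keras drop rate of 0.2) are assumptions, not the authors' code.

```python
import tensorflow as tf

X_DIM = 2048      # assumed input feature dimension; not stated in the quotes above
Z_DIM = 100       # latent size: 100 for the small datasets, 500 for ImageNet
HIDDEN = 1000     # two hidden layers with 1000 nodes each, per the paper
DROP_RATE = 0.2   # paper reports dropout "rate 0.8"; if that is a TF1-style
                  # keep probability, the equivalent Keras drop rate is 0.2

def make_encoder():
    """q_phi(z|x): maps an input feature vector to Gaussian parameters over z."""
    x = tf.keras.Input(shape=(X_DIM,))
    h = tf.keras.layers.Dense(HIDDEN, activation="relu")(x)
    h = tf.keras.layers.Dropout(DROP_RATE)(h)
    h = tf.keras.layers.Dense(HIDDEN, activation="relu")(h)
    h = tf.keras.layers.Dropout(DROP_RATE)(h)
    mu = tf.keras.layers.Dense(Z_DIM)(h)        # mean of q_phi(z|x)
    log_var = tf.keras.layers.Dense(Z_DIM)(h)   # log-variance of q_phi(z|x)
    return tf.keras.Model(x, [mu, log_var], name="encoder")

def make_decoder():
    """p_theta(x|z): maps a latent sample back to the feature space."""
    z = tf.keras.Input(shape=(Z_DIM,))
    h = tf.keras.layers.Dense(HIDDEN, activation="relu")(z)
    h = tf.keras.layers.Dropout(DROP_RATE)(h)
    h = tf.keras.layers.Dense(HIDDEN, activation="relu")(h)
    h = tf.keras.layers.Dropout(DROP_RATE)(h)
    x_hat = tf.keras.layers.Dense(X_DIM)(h)
    return tf.keras.Model(z, x_hat, name="decoder")

def reparameterize(mu, log_var):
    """Standard VAE reparameterization trick: z = mu + sigma * eps."""
    eps = tf.random.normal(tf.shape(mu))
    return mu + tf.exp(0.5 * log_var) * eps
```

Usage follows the usual VAE pattern: `mu, log_var = make_encoder()(x_batch)`, then `z = reparameterize(mu, log_var)` and `x_hat = make_decoder()(z)`; the class-conditioned prior and training objective of the paper are not reproduced here.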