SHOT-VAE: Semi-supervised Deep Generative Models With Label-aware ELBO Approximations
Authors: Hao-Zhe Feng, Kezhi Kong, Minghao Chen, Tianye Zhang, Minfeng Zhu, Wei Chen
AAAI 2021, pp. 7413-7421 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the SHOT-VAE model with sufficient experiments on four benchmark datasets, i.e. MNIST, SVHN, CIFAR-10, and CIFAR-100. In all experiments, we apply stochastic gradient descent (SGD) as optimizer with momentum 0.9 and multiply the learning rate by 0.1 at regularly scheduled epochs. For each experiment, we create five DL-DU splits with different random seeds and the error rates are reported by the mean and variance across splits. |
| Researcher Affiliation | Academia | Zhejiang University; University of Maryland, College Park |
| Pseudocode | Yes | Algorithm 1 SHOT-VAE training process with epoch t. |
| Open Source Code | Yes | The code, with which the most important results can be reproduced, is available on GitHub: https://github.com/FengHZ/SHOT-VAE |
| Open Datasets | Yes | We evaluate the SHOT-VAE model with sufficient experiments on four benchmark datasets, i.e. MNIST, SVHN, CIFAR-10, and CIFAR-100. |
| Dataset Splits | No | The paper mentions training on "four benchmark datasets" and creating "five DL-DU splits with different random seeds" for the labeled/unlabeled data, as well as "varying label ratios from 1.25% to 25%". However, it does not provide explicit details about the train, validation, and test splits (e.g., percentages or exact sample counts) needed for reproducibility. |
| Hardware Specification | No | The paper does not explicitly state any specific hardware details such as GPU or CPU models, memory, or cloud computing instance types used for the experiments. |
| Software Dependencies | No | The paper mentions using "stochastic gradient descent (SGD)" as an optimizer and applying "reparameterization tricks" and "β-VAE strategy", but it does not specify any software names with version numbers for libraries, frameworks, or specific tools used (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | In all experiments, we apply stochastic gradient descent (SGD) as the optimizer with momentum 0.9 and multiply the learning rate by 0.1 at regularly scheduled epochs. We use ϵ = 0.001 in all experiments. The exponential ramp-up schedule is w_t = exp(-γ(1 - t/t_max)^2), where γ is the hyper-parameter controlling the increasing speed, and we use γ = 5 in all experiments. We set a large batch size (e.g., 512). We use N = 1 and τ = 0.67 in all experiments. We also adopt the widely used β-VAE strategy (Burgess et al. 2018) in the training process and choose β = 0.01 in all experiments. (Hedged code sketches of this setup appear after the table.) |
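
The Research Type and Experiment Setup rows quote the same optimizer description: SGD with momentum 0.9 and a learning-rate decay of 0.1 at regularly scheduled epochs. Below is a minimal PyTorch sketch of that setup. The initial learning rate, the milestone epochs, and the total epoch count are not given in the excerpts above, so the values used here (and the stand-in model) are placeholders, not the authors' settings.

```python
import torch
from torch import nn
from torch.optim.lr_scheduler import MultiStepLR

# Stand-in network; the actual SHOT-VAE architecture is not described in the excerpts.
model = nn.Linear(3072, 10)

# Quoted setup: SGD with momentum 0.9; learning rate multiplied by 0.1 at scheduled epochs.
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)  # lr=0.1 is an assumption
scheduler = MultiStepLR(optimizer, milestones=[100, 150], gamma=0.1)   # milestones are assumptions

for epoch in range(200):  # total epoch count assumed for illustration
    # ... one training epoch over the labeled and unlabeled batches ...
    scheduler.step()      # applies the 0.1 decay when a milestone epoch is reached
```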
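The ramp-up formula in the Experiment Setup row was garbled in extraction; it is reconstructed above as w_t = exp(-γ(1 - t/t_max)^2), the standard exponential ramp-up form in semi-supervised learning, which matches the description of γ "controlling the increasing speed". The sketch below implements that schedule with the quoted γ = 5 and collects the other quoted constants; the 400-epoch t_max in the usage example is illustrative only, and how each constant enters the loss is not shown in the excerpts.

```python
import math

def rampup_weight(t: int, t_max: int, gamma: float = 5.0) -> float:
    """Exponential ramp-up weight w_t = exp(-gamma * (1 - t/t_max)^2).

    Grows from exp(-gamma) at t = 0 to 1.0 at t = t_max; the paper
    reports gamma = 5 in all experiments.
    """
    return math.exp(-gamma * (1.0 - t / t_max) ** 2)

# Other quoted constants (their exact placement in the objective is not specified here):
EPSILON = 0.001   # ε used in all experiments
TAU = 0.67        # Gumbel-softmax temperature τ
N_SAMPLES = 1     # number of samples N per forward pass
BETA = 0.01       # β-VAE weight on the KL term (Burgess et al. 2018)

# Usage example: weight at the midpoint of an assumed 400-epoch schedule.
print(rampup_weight(t=200, t_max=400))  # exp(-5 * 0.25) ≈ 0.2865
```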