SHOT-VAE: Semi-supervised Deep Generative Models With Label-aware ELBO Approximations

Authors: Hao-Zhe Feng, Kezhi Kong, Minghao Chen, Tianye Zhang, Minfeng Zhu, Wei Chen (pp. 7413-7421)

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We evaluate the SHOT-VAE model with sufficient experiments on four benchmark datasets, i.e. MNIST, SVHN, CIFAR-10, and CIFAR-100. In all experiments, we apply stochastic gradient descent (SGD) as optimizer with momentum 0.9 and multiply the learning rate by 0.1 at regularly scheduled epochs. For each experiment, we create five DL-DU splits with different random seeds and the error rates are reported by the mean and variance across splits."
Researcher Affiliation | Academia | (1) Zhejiang University; (2) University of Maryland, College Park
Pseudocode | Yes | "Algorithm 1: SHOT-VAE training process with epoch t."
Open Source Code | Yes | "The code, with which the most important results can be reproduced, is available at Github" (https://github.com/FengHZ/SHOT-VAE).
Open Datasets | Yes | "We evaluate the SHOT-VAE model with sufficient experiments on four benchmark datasets, i.e. MNIST, SVHN, CIFAR-10, and CIFAR-100."
Dataset Splits | No | The paper mentions training on "four benchmark datasets", creating "five DL-DU splits with different random seeds" for the labeled/unlabeled partition, and "varying label ratios from 1.25% to 25%". However, it does not give explicit train, validation, and test splits (e.g., percentages or exact sample counts) needed for reproducibility.
Hardware Specification | No | The paper does not state any specific hardware details such as GPU or CPU models, memory, or cloud computing instance types used for the experiments.
Software Dependencies | No | The paper mentions using "stochastic gradient descent (SGD)" as the optimizer and applying "reparameterization tricks" and the "β-VAE strategy", but it does not name any software libraries, frameworks, or tools with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x).
Experiment Setup | Yes | "In all experiments, we apply stochastic gradient descent (SGD) as optimizer with momentum 0.9 and multiply the learning rate by 0.1 at regularly scheduled epochs. We use ϵ = 0.001 in all experiments. The function of the exponential schedule is w_t = exp(-γ(1 - t/t_max)^2), where γ is the hyper-parameter controlling the increasing speed, and we use γ = 5 in all experiments. We set a large batch-size (e.g., 512). We used N = 1 and τ = 0.67 in all experiments. We also take the widely-used β-VAE strategy (Burgess et al. 2018) in the training process and chose β = 0.01 in all experiments." These settings are restated as a code sketch below the table.
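
For reference, the hyper-parameters quoted in the Experiment Setup row can be assembled into a short training-loop skeleton. The PyTorch sketch below is an illustration, not the authors' released code: the total number of epochs, the learning-rate milestones, the initial learning rate, and the placeholder network are assumptions, while the momentum of 0.9, the 0.1 decay factor, γ = 5, τ = 0.67, β = 0.01, ϵ = 0.001, and the batch size of 512 come from the excerpts above.

```python
# Hedged sketch of the reported SHOT-VAE training settings (not the authors' code).
import math
import torch
from torch.optim import SGD
from torch.optim.lr_scheduler import MultiStepLR

T_MAX = 400          # assumed total number of epochs (not stated in the excerpt)
GAMMA_WARMUP = 5.0   # γ controlling the warm-up speed (paper: γ = 5)
BETA = 0.01          # β-VAE weight on the KL term (paper: β = 0.01)
TAU = 0.67           # Gumbel-softmax temperature (paper: τ = 0.67)
BATCH_SIZE = 512     # "a large batch-size (e.g., 512)"
EPS = 0.001          # ϵ = 0.001

# Placeholder network standing in for the actual SHOT-VAE model.
shot_vae = torch.nn.Linear(10, 10)

# SGD with momentum 0.9; the learning rate is multiplied by 0.1 at scheduled
# epochs. The initial lr and the milestone epochs are assumptions.
optimizer = SGD(shot_vae.parameters(), lr=0.1, momentum=0.9)
scheduler = MultiStepLR(optimizer, milestones=[200, 300], gamma=0.1)

def warmup_weight(t: int, t_max: int = T_MAX, gamma: float = GAMMA_WARMUP) -> float:
    """Exponential schedule w_t = exp(-γ (1 - t / t_max)^2) from the paper."""
    return math.exp(-gamma * (1.0 - t / t_max) ** 2)

for t in range(T_MAX):
    w_t = warmup_weight(t)
    # One epoch of SHOT-VAE training would go here, weighting the label-aware
    # ELBO approximation terms by w_t and the KL term by BETA; the discrete
    # latent could be sampled with torch.nn.functional.gumbel_softmax(logits,
    # tau=TAU) using N = 1 sample. Details are omitted in this sketch.
    scheduler.step()
```

With γ = 5, the weight starts near exp(-5) ≈ 0.007 at the first epoch and ramps smoothly to 1 at t = t_max, matching the "exponential schedule" described in the paper.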