Poisson Variational Autoencoder

Authors: Hadi Vafaii, Dekel Galor, Jacob Yates

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental
To evaluate the P-VAE, we perform three sets of experiments. First, we utilize the theoretical results for a linear decoder (eqs. (4) and (21)) to test the effectiveness of our reparameterization algorithm. We compare to alternative VAE models with established reparameterization tricks (e.g., Gaussian).
Researcher Affiliation | Academia
Hadi Vafaii (vafaii@berkeley.edu), Dekel Galor (galor@berkeley.edu), Jacob L. Yates (yates@berkeley.edu), UC Berkeley
Pseudocode | Yes
Algorithm 1: Reparameterized sampling (rsample) for the Poisson distribution.
Input: λ ∈ ℝ^{B×K}, λ > 0   # rate parameter; B, batch size; K, latent dimensionality
       n_exp                # number of exponential samples to generate
       temperature          # controls the sharpness of the thresholding
1: procedure RSAMPLE(λ, n_exp, temperature)
2:   Exp ← Exponential(λ)                            # create exponential distribution
3:   Δt ← Exp.rsample((n_exp,))                      # sample inter-event times, Δt : [n_exp × B × K]
4:   times ← cumsum(Δt, dim=0)                       # compute arrival times, same shape as Δt
5:   indicator ← sigmoid((1 − times) / temperature)  # soft indicator for events within unit time
6:   z ← sum(indicator, dim=0)                       # event counts, or number of spikes, z : [B × K]
7:   return z
8: end procedure
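To make the algorithm concrete, here is a minimal PyTorch sketch of Algorithm 1. The function name `poisson_rsample` and the default values of `n_exp` and `temperature` are illustrative choices, not the authors'; the reference implementation lives in the repository linked below.

```python
import torch

def poisson_rsample(rate: torch.Tensor, n_exp: int = 100, temperature: float = 0.1) -> torch.Tensor:
    """Differentiable sampling from Poisson(rate) via soft-thresholded arrival times.

    rate: positive rate parameters of shape [B, K].
    Returns soft event counts of shape [B, K].
    """
    exp_dist = torch.distributions.Exponential(rate)
    dt = exp_dist.rsample((n_exp,))      # inter-event times, shape [n_exp, B, K]
    times = torch.cumsum(dt, dim=0)      # arrival time of each event
    # Soft indicator of which events arrive within the unit interval [0, 1].
    indicator = torch.sigmoid((1.0 - times) / temperature)
    return indicator.sum(dim=0)          # approximate event counts, shape [B, K]

# Usage: gradients flow back to the rate parameter through the samples.
rate = torch.full((4, 8), 2.0, requires_grad=True)
z = poisson_rsample(rate)
z.sum().backward()
print(rate.grad.shape)  # torch.Size([4, 8])
```

Because the sigmoid replaces a hard threshold at t = 1, the counts remain differentiable in the rate parameter; lower temperatures sharpen the thresholding toward exact Poisson counts at the cost of weaker gradients.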
Open Source Code | Yes
Our code, data, and model checkpoints are available at this repository: https://github.com/hadivafaii/PoissonVAE
Open Datasets | Yes
For sparse coding results, we use 101 natural images from the van Hateren dataset [104]. We tile the images to extract 16×16 patches and apply whitening and contrast normalization, as is typically done in the sparse coding literature [3, 105]. To test the generalizability of our sparse coding results, we repeat these steps on CIFAR10 [106], a dataset we call CIFAR16×16. For the general representation learning results, we use MNIST.
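For orientation, one plausible way to implement the tiling-and-whitening step is sketched below. The excerpt does not specify the whitening scheme, so ZCA whitening is used here as a stand-in, and contrast normalization is omitted; function names are our own.

```python
import torch

def extract_patches(images: torch.Tensor, size: int = 16) -> torch.Tensor:
    """Tile grayscale images [N, H, W] into non-overlapping size x size patches."""
    patches = images.unfold(1, size, size).unfold(2, size, size)  # [N, H//size, W//size, size, size]
    return patches.reshape(-1, size * size)                        # flatten to [num_patches, size**2]

def zca_whiten(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """ZCA-whiten flattened patches so their covariance is approximately the identity."""
    x = x - x.mean(dim=0)
    cov = x.T @ x / (x.shape[0] - 1)
    eigvals, eigvecs = torch.linalg.eigh(cov)
    w = eigvecs @ torch.diag((eigvals + eps).rsqrt()) @ eigvecs.T
    return x @ w
```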
Dataset Splits | Yes
van Hateren: #train = 107,520, #validation = 28,224; CIFAR16×16: #train = 200,000, #validation = 40,000. We use the MNIST dataset primarily for the downstream classification task. After training is done, we use the following train/validation split to evaluate the models. K-nearest neighbor classification (tables 4 and 6): for this task, we only make use of the validation set for both training and testing of the classifier. We divide up the N = 10,000 validation samples into two disjoint sets of N = 5,000 samples each.
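A sketch of the described KNN protocol, assuming latent codes have already been extracted for the 10,000 validation images; scikit-learn appears in the paper's dependency list, but the value of k and the random split here are our assumptions.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def knn_eval(latents: np.ndarray, labels: np.ndarray, k: int = 5, seed: int = 0) -> float:
    """Fit a KNN classifier on one half of the validation latents, test on the other."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(latents))
    fit_idx, test_idx = idx[:5000], idx[5000:10000]  # two disjoint sets of 5,000 samples
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(latents[fit_idx], labels[fit_idx])
    return knn.score(latents[test_idx], labels[test_idx])
```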
Hardware Specification | Yes
Training all models took roughly a week on 8 RTX 6000 Ada GPUs.
Software Dependencies | No
We thank our anonymous reviewers for their helpful comments, and the developers of the software packages used in this project, including PyTorch [97], NumPy [122], SciPy [123], scikit-learn [124], pandas [125], matplotlib [126], and seaborn [127].
Experiment Setup | Yes
For lin|lin and conv|lin models, we used lr = 0.005, and for conv|conv models we used lr = 0.002. All models were trained using the AdaMax optimizer [148] with a cosine learning rate schedule [149]. Please see our code for the full details of training hyperparameters. Overall, we trained 195 VAE models (n = 5 seeds each), resulting in a total of 195 × 5 = 975 VAEs.
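A minimal sketch of the stated optimizer configuration in PyTorch; the placeholder model and the number of epochs are our assumptions, and the remaining hyperparameters are in the authors' repository.

```python
import torch

model = torch.nn.Linear(256, 128)  # placeholder for a P-VAE encoder/decoder stack
num_epochs = 100                   # assumed; not specified in the excerpt

# lr = 0.005 for lin|lin and conv|lin models, 0.002 for conv|conv models.
optimizer = torch.optim.Adamax(model.parameters(), lr=0.005)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=num_epochs)

for epoch in range(num_epochs):
    # ... one pass over the training set, then:
    scheduler.step()
```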