Revisiting Structured Variational Autoencoders
Authors: Yixiu Zhao, Scott Linderman
ICML 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | First, we develop a modern implementation for hardware acceleration, parallelization, and automatic differentiation of the message passing algorithms at the core of the SVAE. Second, we show that by exploiting structure in the prior, the SVAE learns more accurate models and posterior distributions, which translate into improved performance on prediction tasks. Third, we show how the SVAE can naturally handle missing data, and we leverage this ability to develop a novel, self-supervised training approach. Altogether, these results show that the time is ripe to revisit structured variational autoencoders. |
| Researcher Affiliation | Academia | Yixiu Zhao (Applied Physics Department, Stanford University) and Scott W. Linderman (Department of Statistics and the Wu Tsai Neurosciences Institute, Stanford University). Correspondence to: Yixiu Zhao <yixiuz@stanford.edu>. |
| Pseudocode | No | The paper describes computational processes and algorithms, such as the Kalman smoother and parallel message passing, but it does not include any formally labeled "Pseudocode" or "Algorithm" blocks, nor does it present structured steps in a code-like format. (A hedged sketch of such a message-passing routine appears below the table.) |
| Open Source Code | Yes | Our implementation of the SVAE is available at https://github.com/lindermanlab/SVAE-Revisited. |
| Open Datasets | No | The paper describes generating its own "toy datasets sampled from randomly generated linear dynamical systems" and a "synthetic pendulum dataset adapted from Schirmer et al. (2022)". For the pendulum data, it states "A 24 x 24 pixel movie of a swinging pendulum is rendered with noise", indicating custom generation. No specific link, DOI, or repository is provided for public access to these generated datasets. (An illustrative random-LDS sampler is sketched below the table.) |
| Dataset Splits | No | The paper mentions "training data" (e.g., "100 sequences as the training data") and "test set" and reports "validation ELBO" in Appendix B. However, it does not provide explicit details about the percentage or number of samples for training, validation, and test splits, nor does it describe the methodology used to create these splits (e.g., random seed, stratified splitting). |
| Hardware Specification | Yes | The paper reports timing results on an NVIDIA Tesla T4 GPU: "Computational time (Tesla T4)". |
| Software Dependencies | No | The paper mentions software libraries like JAX (Bradbury et al., 2018), JAXopt (Blondel et al., 2022), and Dynamax (Chang et al., 2022). While these are specific tools, explicit version numbers for these libraries (e.g., 'PyTorch 1.9' or 'JAX 0.3.14') are not provided within the text, which is required for reproducibility. |
| Experiment Setup | Yes | For the SVAE, we use one linear layer with two linear readouts for the mean and covariance of the potential, since we know that the optimal recognition potential for LDS data is linearly related to the inputs. For the RNN models, we use various recurrent state sizes H ∈ {10, 20, 30}, and we also add one to two hidden layers with ReLU nonlinearity to the output heads Aϕ,t, bϕ,t, and Vϕ,t. For RNN-AR-NL we implement the nonlinear conditional distribution as a gated recurrent unit (GRU) cell. For CNN-AR-L, we use a 3-layer architecture with 32 features in each layer, and convolution kernels in the time dimension of sizes up to 50 timesteps. We use the same decoder network for all of the models, which is a one-layer linear network with linear readouts for the output mean and covariance. |
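
The "Research Type" and "Pseudocode" rows refer to hardware-accelerated, parallelized, and automatically differentiated message passing (e.g., Kalman smoothing) at the core of the SVAE. Below is a minimal sketch, assuming a standard linear-Gaussian state-space model, of how a differentiable message-passing (Kalman filtering) pass can be written in pure JAX so that it is jit-compilable and differentiable. Function and variable names are illustrative and are not taken from the SVAE-Revisited repository; the paper's implementation additionally parallelizes the time recursion (associative-scan style), whereas this sequential `jax.lax.scan` version does not.

```python
import jax
import jax.numpy as jnp

def kalman_filter(params, emissions):
    """Kalman filter for a linear-Gaussian state-space model, written in pure JAX.
    A sketch only; not the API of the released SVAE code."""
    A, Q, C, R, m0, S0 = params  # dynamics, dynamics noise, emissions, emission noise, initial mean/cov

    def step(carry, y):
        m_pred, S_pred = carry
        # Update: condition the predicted state on the observation y.
        S_obs = C @ S_pred @ C.T + R
        K = jnp.linalg.solve(S_obs, C @ S_pred).T          # Kalman gain
        m_filt = m_pred + K @ (y - C @ m_pred)
        S_filt = S_pred - K @ C @ S_pred
        # Predict: propagate the filtered state through the dynamics.
        m_next = A @ m_filt
        S_next = A @ S_filt @ A.T + Q
        # Marginal log-likelihood contribution of this observation.
        ll = jax.scipy.stats.multivariate_normal.logpdf(y, C @ m_pred, S_obs)
        return (m_next, S_next), (m_filt, S_filt, ll)

    _, (filtered_means, filtered_covs, lls) = jax.lax.scan(step, (m0, S0), emissions)
    return filtered_means, filtered_covs, lls.sum()

# Because the filter is pure JAX, it can be jit-compiled and differentiated, e.g.
# gradients of the marginal log-likelihood with respect to the dynamics matrix:
# grad_A = jax.grad(lambda A: kalman_filter((A, Q, C, R, m0, S0), ys)[2])(A)
```

The "Open Datasets" row notes that the toy data are "sampled from randomly generated linear dynamical systems". The sketch below shows one way such a sampler could look; the dimensions, noise scales, and sequence counts are made-up placeholders rather than the paper's settings.

```python
import jax
import jax.numpy as jnp

def sample_random_lds(key, num_timesteps, latent_dim=2, obs_dim=10):
    """Sample one sequence from a randomly generated LDS (illustrative settings only)."""
    k_dyn, k_emit, k_init, k_noise = jax.random.split(key, 4)
    # Random orthogonal dynamics scaled to be slightly contractive, so latents stay bounded.
    A = 0.95 * jnp.linalg.qr(jax.random.normal(k_dyn, (latent_dim, latent_dim)))[0]
    C = jax.random.normal(k_emit, (obs_dim, latent_dim)) / jnp.sqrt(latent_dim)

    def step(x, key_t):
        kx, ky = jax.random.split(key_t)
        x_next = A @ x + 0.1 * jax.random.normal(kx, (latent_dim,))
        y = C @ x_next + 0.1 * jax.random.normal(ky, (obs_dim,))
        return x_next, y

    x0 = jax.random.normal(k_init, (latent_dim,))
    _, ys = jax.lax.scan(step, x0, jax.random.split(k_noise, num_timesteps))
    return ys

# For example, 100 training sequences of 200 observations each:
keys = jax.random.split(jax.random.PRNGKey(0), 100)
train_data = jax.vmap(lambda k: sample_random_lds(k, 200))(keys)  # shape (100, 200, 10)
```

The "Experiment Setup" row quotes the SVAE recognition network: one linear layer with two linear readouts for the mean and covariance of the potential. A minimal Flax sketch of that description follows; the hidden width, the diagonal covariance parameterization, and the class name `LinearRecognition` are assumptions, not the authors' released code.

```python
import flax.linen as nn
import jax.numpy as jnp

class LinearRecognition(nn.Module):
    """One linear layer with two linear readouts for the mean and covariance of the
    recognition potential. A sketch of the quoted setup; hidden size and diagonal
    covariance are assumptions."""
    latent_dim: int
    hidden_dim: int = 64  # assumed; not specified in the quoted passage

    @nn.compact
    def __call__(self, y):
        h = nn.Dense(self.hidden_dim)(y)         # single linear layer, no nonlinearity
        mean = nn.Dense(self.latent_dim)(h)      # linear readout: potential mean
        log_var = nn.Dense(self.latent_dim)(h)   # linear readout: log of a diagonal covariance
        return mean, jnp.exp(log_var)
```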
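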
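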