Markovian Gaussian Process Variational Autoencoders
Authors: Harrison Zhu, Carles Balsells-Rodas, Yingzhen Li
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we study much longer datasets (T ≥ 100) compared to many previous GPVAE and discrete-time works, which are only of the order of T = 10. We include a range of datasets that highlight different properties of MGPVAE compared to existing approaches: we deliver competitive performance against many existing methods on corrupted and irregularly-sampled video and robot-action data at a fraction of the cost of many existing models. We extend our work to spatiotemporal climate data, which none of the discrete-time sequential VAEs are suited to modelling. We show that it outperforms traditional GP and existing sparse GPVAE models in terms of both predictive performance and speed. |
| Researcher Affiliation | Academia | Harrison Zhu 1 * Carles Balsells-Rodas 1 * Yingzhen Li 1 1Imperial College London. Correspondence to: Harrison Zhu <harrisonzhu5080@gmail.com or hbz15@ic.ac.uk>. |
| Pseudocode | Yes | Algorithm 1 Kalman Filtering and Smoothing |
| Open Source Code | No | The paper mentions implementations in JAX, PyTorch, and TensorFlow and discusses rewriting SVGPVAE using Functorch and MGPVAE with Objax, but it does not provide a direct link to the authors' code for MGPVAE or state that it is publicly available. |
| Open Datasets | Yes | We create sequences of MNIST frames in which the digits are rotated with a periodic length of 50, over T = 100 frames. ... The Mujoco dataset is a physical simulation dataset generated using the Deepmind Control Suite (Tunyasuvunakool et al., 2020). ... We obtained climate data, including temperature and precipitation, from ERA5 using Google Earth Engine API (Gorelick et al., 2017). |
| Dataset Splits | Yes | We have 2 settings (1) 1280/400 train-test split with length T = 100 and (2) 320/100 for length T = 1000. ... SVGPVAE, VRNN, latent ODE and MGPVAE we use early stopping with the validation RMSE on 320 if T = 100 else 80 validation sequences (not part of train or test sets). |
| Hardware Specification | Yes | All wall-clock time computations are done on NVIDIA RTX-3090 GPUs with 24576 MiB RAM. |
| Software Dependencies | No | The paper mentions implementing models across 'JAX, PyTorch and TensorFlow' and using 'Objax (Developers, 2020)' and 'Functorch (Horace He, 2021)', but it does not provide specific version numbers for these libraries. |
| Experiment Setup | Yes | Batch size: 40. Training epochs: 300. Number of latent channels: L = 16. Adam optimizer learning rate: 1e-3. clipGradNorm(model parameters, 100). Encoder structure: Conv(out=32, k=3, strides=2), ReLU(), Conv2D(out=32, k=3, strides=2), ReLU(), Flatten(), hiddenToVariationalParams(), where hiddenToVariationalParams() depends on the model. Decoder structure: Linear(L, 8*8*32), Reshape((8,32,32)), Conv2DTranspose(out=64, k=3, strides=2, padding=same), ReLU(), Conv2DTranspose(out=32, k=3, strides=2, padding=same), ReLU(), Conv2DTranspose(out=1, k=3, strides=1, padding=same), Reshape((32, 32, 1)). |
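The pseudocode flagged above is "Algorithm 1 Kalman Filtering and Smoothing", i.e. the standard linear-Gaussian filter followed by a Rauch-Tung-Striebel (RTS) smoothing pass. As a minimal sketch (an illustration of those standard recursions, not the authors' implementation; all variable names here are my own), it can be written in NumPy as:

```python
import numpy as np

def kalman_filter_smoother(ys, A, Q, H, R, m0, P0):
    """Kalman filter + RTS smoother for the linear-Gaussian state-space model
    x_t = A x_{t-1} + w_t,  w_t ~ N(0, Q)
    y_t = H x_t + v_t,      v_t ~ N(0, R).
    Returns smoothed means and covariances for all T steps."""
    T, d = len(ys), m0.shape[0]
    ms = np.zeros((T, d)); Ps = np.zeros((T, d, d))      # filtered moments
    mp = np.zeros((T, d)); Pp = np.zeros((T, d, d))      # predicted moments
    m, P = m0, P0
    for t in range(T):
        # Predict step: propagate the previous posterior through the dynamics.
        m_pred = A @ m
        P_pred = A @ P @ A.T + Q
        mp[t], Pp[t] = m_pred, P_pred
        # Update step: condition on the observation y_t.
        S = H @ P_pred @ H.T + R                          # innovation covariance
        K = P_pred @ H.T @ np.linalg.inv(S)               # Kalman gain
        m = m_pred + K @ (ys[t] - H @ m_pred)
        P = P_pred - K @ S @ K.T
        ms[t], Ps[t] = m, P
    # Backward RTS smoothing pass over the filtered estimates.
    ms_s, Ps_s = ms.copy(), Ps.copy()
    for t in range(T - 2, -1, -1):
        G = Ps[t] @ A.T @ np.linalg.inv(Pp[t + 1])        # smoother gain
        ms_s[t] = ms[t] + G @ (ms_s[t + 1] - mp[t + 1])
        Ps_s[t] = Ps[t] + G @ (Ps_s[t + 1] - Pp[t + 1]) @ G.T
    return ms_s, Ps_s
```

Both passes run in O(T) time, which is what lets MGPVAE scale to the long sequences (T = 100 and T = 1000) reported in the dataset-split row above.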