Mind the Gap when Conditioning Amortised Inference in Sequential Latent-Variable Models
Authors: Justin Bayer, Maximilian Soelch, Atanas Mirchev, Baris Kayalibay, Patrick van der Smagt
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate these theoretical findings in three scenarios: traffic flow, handwritten digits, and aerial vehicle dynamics. Using fully-conditioned approximate posteriors, performance improves in terms of generative modelling and multi-step prediction. We empirically show its effects in an extensive study on three real-world data sets on the common use case of variational state-space models. |
| Researcher Affiliation | Industry | Machine Learning Research Lab, Volkswagen Group, Munich, Germany {bayerj,m.soelch}@argmax.ai |
| Pseudocode | No | The paper describes its methods in text and mathematical formulations but does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code, such as a specific repository link or an explicit code release statement. |
| Open Datasets | Yes | We apply semi- and fully-conditioned VSSMs to UAV modelling (Antonini et al., 2018). We transformed the MNIST data set into a sequential data set... We consider the Seattle loop data set (Cui et al., 2019; 2020) of average speed measurements of cars at 323 locations on motorways near Seattle... |
| Dataset Splits | Yes | We go with a typical split into training, validation and test data. Each consists of 10,240 sequences of 25 time steps starting out at the origin. We selected the best model for semi- and fully-conditioned models separately according to the respective ELBOs on the validation set after 30,000 updates... The data was split into training, validation and testing data by months, January up to July for training, July to September for validation and the remainder for testing. (A hypothetical sketch of this month-based split appears below the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models or processor types used for running its experiments. |
| Software Dependencies | No | The paper mentions optimizers and model architectures but does not provide specific ancillary software details with version numbers (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | See appendix C for details. (UAV) ... We conducted a hyper-parameter search of 64 experiments for 15,000 iterations. The 5 best experiments (according to the ELBO at the last iteration) were continued for 85,000 further iterations. (MNIST) ... A hyper-parameter search of 128 configurations was conducted. (Traffic Flow). Appendices C.2, C.3, D.1, D.2, E.2, E.3, E.4 list detailed hyper-parameters like 'batch size 16', 'optimizer.learning rate 0.001', 'n latent 22' etc. (A hypothetical sketch of the two-stage search appears below the table.) |
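
The month-based split quoted in the Dataset Splits row is concrete enough to sketch. Below is a minimal pandas illustration; the column layout, sampling frequency, and exact boundary months are assumptions, since the paper does not release its preprocessing code.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Seattle loop data: average speed readings at
# 323 motorway locations over one year. The 5-minute sampling frequency
# and random values are placeholders, not the real data.
index = pd.date_range("2015-01-01", "2015-12-31 23:55", freq="5min")
speeds = pd.DataFrame(
    np.random.default_rng(0).uniform(0.0, 70.0, size=(len(index), 323)),
    index=index,
)

# Month-based split as quoted from the paper: January up to July for
# training, July to September for validation, the remainder for testing.
# The exact boundaries are an assumption, since "July" appears in both
# the training and validation ranges as quoted.
train = speeds[speeds.index.month <= 6]
val = speeds[speeds.index.month.isin([7, 8, 9])]
test = speeds[speeds.index.month >= 10]

print(f"train: {len(train)}, val: {len(val)}, test: {len(test)}")
```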
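
The two-stage protocol quoted for the UAV experiments (64 short runs, then the 5 best continued) can likewise be sketched. Everything below is hypothetical: the search ranges, the `train` stub, and the config keys merely mirror the protocol and the appendix values quoted in the Experiment Setup row; none of it is the authors' code.

```python
import random

rng = random.Random(0)

def sample_config():
    # Illustrative ranges only, loosely anchored to values listed in the
    # paper's appendices (e.g. batch size 16, learning rate 0.001, n latent 22).
    return {
        "batch_size": rng.choice([16, 32, 64]),
        "learning_rate": 10 ** rng.uniform(-4.0, -2.0),
        "n_latent": rng.randint(8, 32),
    }

def train(config, n_iterations, state=None):
    # Placeholder for actual model training; returns an opaque training
    # state and a fake final ELBO. The real objective is the VSSM ELBO.
    return (state or dict(config)), -100.0 * rng.random()

# Stage 1: train all 64 sampled configurations for 15,000 iterations.
stage1 = [(cfg, *train(cfg, 15_000))
          for cfg in (sample_config() for _ in range(64))]

# Stage 2: continue the 5 best runs (highest final ELBO) for a further
# 85,000 iterations, as described for the UAV experiments.
top5 = sorted(stage1, key=lambda run: run[2], reverse=True)[:5]
finals = [train(cfg, 85_000, state) for cfg, state, _ in top5]
```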